Paper Group ANR 436
Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity
Title | Estimating Dynamic Conditional Spread Densities to Optimise Daily Storage Trading of Electricity |
Authors | Ekaterina Abramova, Derek Bunn |
Abstract | This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast electricity price spreads between different hours of the day. This supports an optimal day-ahead storage and discharge schedule, and thereby facilitates a bidding strategy for a merchant arbitrage facility into the day-ahead auctions for wholesale electricity. The four latent moments of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the mean, variance, skewness and kurtosis of the densities to respond hourly to such factors as weather and demand forecasts. The best specification for each spread is selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions. Those analytical properties also allow the calculation of the risk associated with the spread arbitrages. From these spread densities, the optimal daily operation of a battery storage facility is determined. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.06668v1 |
PDF | http://arxiv.org/pdf/1903.06668v1.pdf |
PWC | https://paperswithcode.com/paper/estimating-dynamic-conditional-spread |
Repo | |
Framework | |
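The model selection step above hinges on the pinball (quantile) loss. A minimal sketch of how candidate density specifications might be scored on held-out spreads, assuming an illustrative quantile grid and placeholder forecasts (the paper derives quantiles from closed-form CDFs, not from data as done here):

```python
import numpy as np

def pinball_loss(y_true, y_pred_quantiles, taus):
    """Average pinball (quantile) loss over a grid of quantile levels.

    y_true: (n,) realized spreads
    y_pred_quantiles: (n, len(taus)) quantile forecasts from a fitted
                      conditional density (e.g. via its inverse CDF)
    taus: quantile levels in (0, 1)
    """
    losses = []
    for j, tau in enumerate(taus):
        diff = y_true - y_pred_quantiles[:, j]
        # pinball loss: tau * diff if diff >= 0, else (tau - 1) * diff
        losses.append(np.mean(np.maximum(tau * diff, (tau - 1) * diff)))
    return float(np.mean(losses))

# Example: score one candidate specification (placeholder data)
taus = np.linspace(0.05, 0.95, 19)
y = np.random.default_rng(0).normal(size=100)          # spread outcomes
q = np.quantile(y, taus)[None, :].repeat(100, axis=0)  # quantile forecasts
print(pinball_loss(y, q, taus))
```

The specification with the lowest average pinball loss across the quantile grid would be retained for each hourly spread.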
Towards Unsupervised Cancer Subtyping: Predicting Prognosis Using A Histologic Visual Dictionary
Title | Towards Unsupervised Cancer Subtyping: Predicting Prognosis Using A Histologic Visual Dictionary |
Authors | Hassan Muhammad, Carlie S. Sigel, Gabriele Campanella, Thomas Boerner, Linda M. Pak, Stefan Büttner, Jan N. M. IJzermans, Bas Groot Koerkamp, Michael Doukas, William R. Jarnagin, Amber Simpson, Thomas J. Fuchs |
Abstract | Unlike common cancers, such as those of the prostate and breast, tumor grading in rare cancers is difficult and largely undefined because of small sample sizes, the sheer volume of time needed to undertake such a task, and the inherent difficulty of extracting human-observed patterns. One of the most challenging examples is intrahepatic cholangiocarcinoma (ICC), a primary liver cancer arising from the biliary system, for which there is well-recognized tumor heterogeneity and no grading paradigm or prognostic biomarkers. In this paper, we propose a new unsupervised deep convolutional autoencoder-based clustering model that groups together cellular and structural tumor morphologies in 246 digitized ICC whole slides, based on visual similarity. From this visual dictionary of histologic patterns, we use the clusters as covariates to train Cox proportional hazards survival models. In univariate analysis, three clusters were significantly associated with recurrence-free survival. Combinations of these clusters were significant in multivariate analysis. In a multivariate analysis of all clusters, five were significantly associated with recurrence-free survival; however, the overall model was not significant. Finally, a pathologist assigned clinical terminology to the significant clusters in the visual dictionary and found evidence supporting the hypothesis that collagen-enriched fibrosis plays a role in disease severity. These results offer insight into the future of cancer subtyping and show that computational pathology can contribute to disease prognostication, especially in rare cancers. |
Tasks | |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05257v1 |
PDF | http://arxiv.org/pdf/1903.05257v1.pdf |
PWC | https://paperswithcode.com/paper/towards-unsupervised-cancer-subtyping |
Repo | |
Framework | |
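The survival-analysis step — fitting Cox proportional hazards models with visual-dictionary clusters as covariates — can be sketched with the lifelines library. All column names and data below are hypothetical placeholders, not the paper's cohort:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical per-patient table: fraction of each slide occupied by each
# visual-dictionary cluster, plus recurrence-free survival time and event flag.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "cluster_0": rng.uniform(size=n),
    "cluster_1": rng.uniform(size=n),
    "cluster_2": rng.uniform(size=n),
    "rfs_months": rng.exponential(24.0, size=n),  # recurrence-free survival time
    "event": rng.integers(0, 2, size=n),          # 1 = recurrence observed
})

# Univariate analysis: one Cox model per visual-dictionary cluster
for col in ("cluster_0", "cluster_1", "cluster_2"):
    cph = CoxPHFitter().fit(df[[col, "rfs_months", "event"]],
                            duration_col="rfs_months", event_col="event")
    print(col, "p =", float(cph.summary.loc[col, "p"]))

# Multivariate analysis: all clusters jointly
CoxPHFitter().fit(df, duration_col="rfs_months", event_col="event").print_summary()
```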
Boolean matrix factorization meets consecutive ones property
Title | Boolean matrix factorization meets consecutive ones property |
Authors | Nikolaj Tatti, Pauli Miettinen |
Abstract | Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization where we additionally require that the factor matrices have the consecutive-ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered in a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to dealing with this is to bundle edges together and represent them as a ribbon. We show that OBMF can be used for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but it also admits no polynomial-time algorithm with a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using pq-trees. We also extend the problem to the cyclic-ones property and to symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well. |
Tasks | |
Published | 2019-01-17 |
URL | http://arxiv.org/abs/1901.05797v1 |
PDF | http://arxiv.org/pdf/1901.05797v1.pdf |
PWC | https://paperswithcode.com/paper/boolean-matrix-factorization-meets |
Repo | |
Framework | |
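A minimal sketch of the alternating fix-one-side step for rank-1 Boolean factorization described above. For simplicity this omits the pq-tree machinery that enforces the consecutive-ones constraint in the paper; the fixed-side subproblem here is solved purely by Hamming-error counting:

```python
import numpy as np

def best_side(X, u):
    """Given a fixed row-indicator u, pick the column-indicator v that
    minimizes the Hamming error of the rank-1 Boolean product u v^T.
    (The paper additionally restricts the factor with a pq-tree so it has
    consecutive ones; that constraint is omitted in this sketch.)"""
    rows = X[u.astype(bool)]      # rows selected by u
    ones = rows.sum(axis=0)       # 1s covered if v_j = 1
    zeros = rows.shape[0] - ones  # 0s wrongly covered if v_j = 1
    return (ones > zeros).astype(int)

def rank1_boolean(X, iters=20, seed=0):
    """Alternate: fix u and solve for v, then reverse the roles."""
    u = np.random.default_rng(seed).integers(0, 2, X.shape[0])
    for _ in range(iters):
        v = best_side(X, u)
        u = best_side(X.T, v)
    return u, best_side(X, u)

X = (np.random.default_rng(1).random((8, 10)) < 0.3).astype(int)
u, v = rank1_boolean(X)
print("Hamming error:", np.abs(X - np.outer(u, v)).sum())
```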
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Title | Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics |
Authors | Denis Steckelmacher, Hélène Plisnier, Diederik M. Roijers, Ann Nowé |
Abstract | Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an on-policy critic. We propose Bootstrapped Dual Policy Iteration (BDPI), a novel model-free reinforcement-learning algorithm for continuous states and discrete actions, with an actor and several off-policy critics. Off-policy critics are compatible with experience replay, ensuring high sample-efficiency, without the need for off-policy corrections. The actor, by slowly imitating the average greedy policy of the critics, leads to high-quality and state-specific exploration, which we compare to Thompson sampling. Because the actor and critics are fully decoupled, BDPI is remarkably stable, and unusually robust to its hyper-parameters. BDPI is significantly more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete, continuous and pixel-based tasks. Source code: https://github.com/vub-ai-lab/bdpi. |
Tasks | |
Published | 2019-03-11 |
URL | https://arxiv.org/abs/1903.04193v2 |
PDF | https://arxiv.org/pdf/1903.04193v2.pdf |
PWC | https://paperswithcode.com/paper/sample-efficient-model-free-reinforcement |
Repo | |
Framework | |
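A schematic of the actor update the abstract describes, in which the actor slowly imitates the average greedy policy of the off-policy critics. The soft-update form and learning rate below are illustrative; the released code linked above is authoritative:

```python
import numpy as np

def bdpi_actor_update(actor_probs, critic_qs, lr=0.05):
    """One BDPI-style actor step on a batch of states.

    actor_probs: (batch, n_actions) current actor distribution
    critic_qs:   (n_critics, batch, n_actions) Q-values of bootstrapped critics
    """
    greedy = np.zeros_like(actor_probs)
    n_critics = critic_qs.shape[0]
    for q in critic_qs:  # accumulate each critic's greedy policy
        greedy[np.arange(q.shape[0]), q.argmax(axis=1)] += 1.0 / n_critics
    # move the actor slowly toward the critics' average greedy policy
    new_probs = (1.0 - lr) * actor_probs + lr * greedy
    return new_probs / new_probs.sum(axis=1, keepdims=True)

actor = np.full((4, 3), 1 / 3)                         # uniform over 3 actions
qs = np.random.default_rng(0).normal(size=(8, 4, 3))   # 8 bootstrapped critics
print(bdpi_actor_update(actor, qs))
```

Because the target is an average over many critics trained on replayed experience, the actor's exploration stays state-specific, which is the Thompson-sampling-like behavior the abstract mentions.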
Neural Forest Learning
Title | Neural Forest Learning |
Authors | Yun-Hao Cao, Jianxin Wu |
Abstract | We propose Neural Forest Learning (NFL), a novel deep-learning-based, random-forest-like method. In contrast to previous forest methods, NFL enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy than previous forest methods. Furthermore, NFL learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with a negligible increase in parameters, floating point operations (FLOPs) and real running time. We achieve superior performance on 7 machine learning datasets when compared to random forests and GBDTs. On the fine-grained benchmarks CUB-200-2011, FGVC-Aircraft and Stanford Cars, we achieve gains of over 5.7%, 6.9% and 7.8% for VGG-16, respectively. Moreover, NFL converges in far fewer epochs, further accelerating network training. On the large-scale ImageNet ILSVRC-12 validation set, integrating NFL into ResNet-18 achieves top-1/top-5 errors of 28.32%/9.77%, outperforming ResNet-18 by 1.92%/1.15% with negligible extra cost, and the improvement is consistent across various architectures. |
Tasks | Representation Learning |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07845v1 |
PDF | https://arxiv.org/pdf/1911.07845v1.pdf |
PWC | https://paperswithcode.com/paper/neural-forest-learning |
Repo | |
Framework | |
All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance
Title | All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance |
Authors | J. Camacho, A. K. Smilde, E. Saccenti, J. A. Westerhuis |
Abstract | Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA) that combines variance maximization and sparsity with the ultimate goal of improving data interpretation. When moving from PCA to sPCA, there are a number of implications that the practitioner needs to be aware of. A relevant one is that scores and loadings in sPCA may not be orthogonal. For this reason, the traditional way of computing scores, residuals and explained variance used in classical PCA cannot be directly applied to sPCA models. This also affects how sPCA components should be visualized. In this paper we illustrate this problem both theoretically and numerically, using simulations for several state-of-the-art sPCA algorithms, and provide the proper computation of the different elements mentioned. We show that sPCA approaches exhibit disparate and limited performance when modeling noise-free, sparse data. In a follow-up paper, we discuss the theoretical properties that lead to this problem. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.03989v1 |
PDF | https://arxiv.org/pdf/1907.03989v1.pdf |
PWC | https://paperswithcode.com/paper/all-sparse-pca-models-are-wrong-but-some-are |
Repo | |
Framework | |
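The core issue the abstract raises — non-orthogonal sparse loadings break the classical score/residual formulas — can be illustrated numerically. The sketch below uses an oblique projection onto the loading space; this mirrors the kind of correction the paper derives, though its exact formulas may differ:

```python
import numpy as np

# With sparse (non-orthogonal) loadings P, the naive PCA formulas
# T = X P and E = X - T P^T no longer give a proper decomposition.
# Projecting X onto the column space of P restores one:
#   T = X P (P^T P)^{-1},  X_hat = T P^T,  E = X - X_hat,
# so X_hat and E are orthogonal and explained variance is well defined.

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
P = rng.normal(size=(6, 2))
P[np.abs(P) < 0.5] = 0.0              # a sparse, non-orthogonal loading matrix

T = X @ P @ np.linalg.inv(P.T @ P)    # oblique scores
X_hat = T @ P.T                       # model part
E = X - X_hat                         # residuals

print("residuals orthogonal to model part:",
      np.allclose(E.T @ X_hat, 0, atol=1e-9))
print("explained variance:", round(1 - (E**2).sum() / (X**2).sum(), 3))
```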
Transferable Semi-supervised 3D Object Detection from RGB-D Data
Title | Transferable Semi-supervised 3D Object Detection from RGB-D Data |
Authors | Yew Siang Tang, Gim Hee Lee |
Abstract | We investigate the direction of training a 3D object detector for new object classes from only 2D bounding box labels of these new classes, while simultaneously transferring information from 3D bounding box labels of the existing classes. To this end, we propose a transferable semi-supervised 3D object detection model that learns a 3D object detector network from training data with two disjoint sets of object classes - a set of strong classes with both 2D and 3D box labels, and another set of weak classes with only 2D box labels. In particular, we suggest a relaxed reprojection loss, a box prior loss and a Box-to-Point Cloud Fit network that allow us to effectively transfer useful 3D information from the strong classes to the weak classes during training, and consequently, enable the network to detect 3D objects in the weak classes during inference. Experimental results show that our proposed algorithm outperforms baseline approaches and achieves promising results compared to fully-supervised approaches on the SUN-RGBD and KITTI datasets. Furthermore, we show that our Box-to-Point Cloud Fit network improves the performance of the fully-supervised approaches on both datasets. |
Tasks | 3D Object Detection, Object Detection |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10300v1 |
PDF | http://arxiv.org/pdf/1904.10300v1.pdf |
PWC | https://paperswithcode.com/paper/transferable-semi-supervised-3d-object |
Repo | |
Framework | |
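A simplified stand-in for the relaxed reprojection loss mentioned above: project the corners of a predicted 3D box into the image and penalize corners that fall outside the labeled 2D box. This assumes a pinhole camera with no distortion, and all values are illustrative rather than the paper's formulation:

```python
import numpy as np

def relaxed_reprojection_loss(corners_3d, box_2d, K):
    """Hinge-style penalty keeping projected 3D-box corners inside a 2D box.

    corners_3d: (8, 3) box corners in camera coordinates (z > 0)
    box_2d: (x1, y1, x2, y2) ground-truth 2D box
    K: (3, 3) camera intrinsics
    """
    proj = (K @ corners_3d.T).T
    uv = proj[:, :2] / proj[:, 2:3]   # perspective divide
    x1, y1, x2, y2 = box_2d
    # zero loss when a corner is inside the box, linear penalty outside
    loss = (np.maximum(0, x1 - uv[:, 0]) + np.maximum(0, uv[:, 0] - x2)
            + np.maximum(0, y1 - uv[:, 1]) + np.maximum(0, uv[:, 1] - y2))
    return loss.mean()

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
corners = np.array([[x, y, z] for x in (-1, 1)
                    for y in (-0.5, 0.5) for z in (9, 11)], float)
print(relaxed_reprojection_loss(corners, (200, 150, 440, 330), K))
```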
Single neuron-based neural networks are as efficient as dense deep neural networks in binary and multi-class recognition problems
Title | Single neuron-based neural networks are as efficient as dense deep neural networks in binary and multi-class recognition problems |
Authors | Yassin Khalifa, Justin Hawks, Ervin Sejdic |
Abstract | Recent advances in neuroscience have revealed many principles about neural processing. In particular, many biological systems were found to reconfigure/recruit single neurons to generate multiple kinds of decisions. Such findings have the potential to advance our understanding of the design and optimization process of artificial neural networks. Previous work demonstrated that dense neural networks are needed to shape complex decision surfaces required for AI-level recognition tasks. We investigate the ability to model high-dimensional recognition problems using networks of single or several neurons that are relatively easier to train. By employing three datasets, we test the use of a population of single-neuron networks in performing multi-class recognition tasks. Surprisingly, we find that sparse networks can be as efficient as dense networks in both binary and multi-class tasks. Moreover, single-neuron networks demonstrate superior performance in the binary classification setting and competitive results when combined for multi-class recognition. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12135v1 |
PDF | https://arxiv.org/pdf/1905.12135v1.pdf |
PWC | https://paperswithcode.com/paper/single-neuron-based-neural-networks-are-as |
Repo | |
Framework | |
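A toy version of the paper's setup: a population of single logistic neurons, each trained independently and combined one-vs-rest for multi-class recognition. The dataset and hyperparameters are placeholders:

```python
import numpy as np

class SingleNeuron:
    """One logistic neuron trained with plain gradient descent."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def forward(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def fit(self, X, y, epochs=200):
        for _ in range(epochs):
            g = self.forward(X) - y          # gradient of cross-entropy
            self.w -= self.lr * X.T @ g / len(y)
            self.b -= self.lr * g.mean()

# Placeholder 3-class blob data
rng = np.random.default_rng(0)
centers = np.array([[3, 0], [0, 3], [-3, -3]])
X = np.vstack([rng.normal(c, 1.0, size=(100, 2)) for c in centers])
y = np.repeat([0, 1, 2], 100)

# One single-neuron network per class (one-vs-rest), combined by argmax
neurons = [SingleNeuron(2) for _ in range(3)]
for k, neuron in enumerate(neurons):
    neuron.fit(X, (y == k).astype(float))
pred = np.argmax(np.column_stack([n.forward(X) for n in neurons]), axis=1)
print("train accuracy:", (pred == y).mean())
```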
Disentangling trainability and generalization in deep learning
Title | Disentangling trainability and generalization in deep learning |
Authors | Lechao Xiao, Jeffrey Pennington, Samuel S. Schoenholz |
Abstract | A fundamental goal in deep learning is the characterization of trainability and generalization of neural networks as a function of their architecture and hyperparameters. In this paper, we discuss these challenging issues in the context of wide neural networks at large depths, where we will see that the situation simplifies considerably. To do this, we leverage recent advances that have separately shown: (1) that in the wide network limit, random networks before training are Gaussian Processes governed by a kernel known as the Neural Network Gaussian Process (NNGP) kernel, (2) that at large depths the spectrum of the NNGP kernel simplifies considerably and becomes “weakly data-dependent”, and (3) that gradient descent training of wide neural networks is described by a kernel called the Neural Tangent Kernel (NTK) that is related to the NNGP. Here we show that in the large-depth limit the spectrum of the NTK simplifies in much the same way as that of the NNGP kernel. By analyzing this spectrum, we arrive at a precise characterization of trainability and a necessary condition for generalization across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs). In particular, we find that there are large regions of hyperparameter space where networks can only memorize the training set, in the sense that they reach perfect training accuracy but completely fail to generalize outside the training set, in contrast with several recent results. By comparing CNNs with and without global average pooling, we show that CNNs without average pooling have very nearly identical learning dynamics to FCNs, while CNNs with pooling contain a correction that alters their generalization performance. We perform a thorough empirical investigation of these theoretical results and find excellent agreement on real datasets. |
Tasks | Gaussian Processes |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.13053v1 |
PDF | https://arxiv.org/pdf/1912.13053v1.pdf |
PWC | https://paperswithcode.com/paper/disentangling-trainability-and-generalization-1 |
Repo | |
Framework | |
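The NNGP and NTK kernels the abstract analyzes can be computed in closed form with the neural_tangents library. A minimal sketch for a deep fully-connected network; the depth and widths below are illustrative stand-ins for the paper's "large depth" regime:

```python
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# An infinite-width fully-connected ReLU network of depth 16;
# kernel_fn returns both the NNGP and NTK kernels analytically.
init_fn, apply_fn, kernel_fn = stax.serial(
    *([stax.Dense(512), stax.Relu()] * 16),
    stax.Dense(1),
)

x = random.normal(random.PRNGKey(0), (20, 8))
kernels = kernel_fn(x, x, ('nngp', 'ntk'))

# Inspect the spectra whose large-depth behavior the paper characterizes
nngp_eigs = jnp.linalg.eigvalsh(kernels.nngp)
ntk_eigs = jnp.linalg.eigvalsh(kernels.ntk)
print("NNGP condition number:", nngp_eigs[-1] / nngp_eigs[0])
print("NTK  condition number:", ntk_eigs[-1] / ntk_eigs[0])
```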
Noise Analysis of Photonic Modulator Neurons
Title | Noise Analysis of Photonic Modulator Neurons |
Authors | Thomas Ferreira de Lima, Alexander N. Tait, Hooman Saeidi, Mitchell A. Nahmias, Hsuan-Tung Peng, Siamak Abbaslou, Bhavin J. Shastri, Paul R. Prucnal |
Abstract | Neuromorphic photonics relies on efficiently emulating analog neural networks at high speeds. Prior work showed that transducing signals from the optical to the electrical domain and back with transimpedance gain was an efficient approach to implementing analog photonic neurons and scalable networks. Here, we examine modulator-based photonic neuron circuits with passive and active transimpedance gains, with special attention to the sources of noise propagation. We find that a modulator nonlinear transfer function can suppress noise, which is necessary to avoid noise propagation in hardware neural networks. In addition, while efficient modulators can reduce power for an individual neuron, signal-to-noise ratios must be traded off with power consumption at a system level. Active transimpedance amplifiers may help relax this tradeoff for conventional p-n junction silicon photonic modulators, but a passive transimpedance circuit is sufficient when very efficient modulators (i.e. low C and low V-pi) are employed. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07325v1 |
PDF | https://arxiv.org/pdf/1907.07325v1.pdf |
PWC | https://paperswithcode.com/paper/noise-analysis-of-photonic-modulator-neurons |
Repo | |
Framework | |
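The noise-suppression argument can be made concrete with a small-signal calculation: output noise scales with the local slope of the neuron's transfer function, so a saturating response attenuates noise near its rails. A toy numeric sketch — the sigmoid and all numbers are illustrative, not device models:

```python
import numpy as np

# Small-signal view of noise propagation through a transfer function f:
# output noise ≈ |f'(x)| * input noise, so a saturating (sigmoid-like)
# modulator response suppresses noise near its rails.
f = lambda x: 1.0 / (1.0 + np.exp(-x))   # sigmoid-like modulator response
df = lambda x: f(x) * (1.0 - f(x))       # its slope

sigma_in = 0.05
for x in (-4.0, 0.0, 4.0):               # near rails vs. mid-swing
    print(f"bias {x:+.1f}: output noise ≈ {df(x) * sigma_in:.4f}")
# Near the rails |f'| is small, so input noise is strongly attenuated;
# at mid-swing |f'| is maximal (0.25 here) and noise passes through.
```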
Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models
Title | Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models |
Authors | Daniele Castellana, Davide Bacciu |
Abstract | The Bottom-Up Hidden Tree Markov Model is a highly expressive model for tree-structured data. Unfortunately, it cannot be used in practice due to the intractable size of its state-transition matrix. We propose a new approximation which relies on the Tucker factorisation of tensors. The probabilistic interpretation of such an approximation allows us to define a new probabilistic model for tree-structured data. Hence, we define the new approximated model and derive its learning algorithm. Then, we empirically assess the effective power of the new model by evaluating it on two different tasks. In both cases, our model outperforms the other approximated model known in the literature. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13528v1 |
PDF | https://arxiv.org/pdf/1905.13528v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-tensor-factorisation-for-bottom-up |
Repo | |
Framework | |
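A sketch of the key idea — compressing an intractably large state-transition tensor with a Tucker factorisation — using the tensorly library. Sizes and ranks are illustrative, and the paper's probabilistic parameterisation differs from this plain least-squares decomposition:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

# A bottom-up hidden tree Markov model with C hidden states and maximum
# out-degree L has a state-transition tensor with C^(L+1) entries; Tucker
# factorisation replaces it with a small core plus one factor per mode.
C, L = 10, 3
full = np.random.default_rng(0).random((C,) * (L + 1))  # C x C x C x C
full /= full.sum(axis=-1, keepdims=True)                # normalize transitions

core, factors = tucker(tl.tensor(full), rank=[4] * (L + 1))
approx = tl.tucker_to_tensor((core, factors))

n_full = full.size
n_tucker = core.size + sum(f.size for f in factors)
print(f"parameters: {n_full} -> {n_tucker}; relative error "
      f"{np.linalg.norm(approx - full) / np.linalg.norm(full):.3f}")
```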
CollaGAN: Collaborative GAN for Missing Image Data Imputation
Title | CollaGAN: Collaborative GAN for Missing Image Data Imputation |
Authors | Dongwook Lee, Junyoung Kim, Won-Jin Moon, Jong Chul Ye |
Abstract | In many applications requiring multiple inputs to obtain a desired output, if any of the input data is missing, it often introduces large amounts of bias. Although many techniques have been developed for imputing missing data, image imputation remains difficult due to the complicated nature of natural images. To address this problem, here we propose a novel framework for missing image data imputation, called Collaborative Generative Adversarial Network (CollaGAN). CollaGAN converts an image imputation problem into a multi-domain images-to-image translation task, so that a single generator and discriminator network can successfully estimate the missing data using the remaining clean data set. We demonstrate that CollaGAN produces images with higher visual quality than existing competing approaches in various image imputation tasks. |
Tasks | Image Imputation, Imputation |
Published | 2019-01-28 |
URL | http://arxiv.org/abs/1901.09764v3 |
PDF | http://arxiv.org/pdf/1901.09764v3.pdf |
PWC | https://paperswithcode.com/paper/collagan-collaborative-gan-for-missing-image |
Repo | |
Framework | |
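A sketch of how a CollaGAN-style single generator can be fed for imputation: the remaining clean domain images are stacked with a one-hot mask identifying the missing target domain. The exact input layout here is an assumption for illustration, not the paper's specification:

```python
import torch

def collagan_input(images, missing_idx):
    """Assemble the single-generator input for CollaGAN-style imputation.

    images: dict {domain_idx: (C, H, W) tensor} of the remaining clean images
    missing_idx: index of the domain to impute
    The blank slot and one-hot mask layout are illustrative assumptions.
    """
    n_domains = len(images) + 1
    _, h, w = next(iter(images.values())).shape
    stacked = []
    for d in range(n_domains):
        if d == missing_idx:
            stacked.append(torch.zeros(1, h, w))  # blank slot for the target
        else:
            stacked.append(images[d])
    mask = torch.zeros(n_domains, h, w)
    mask[missing_idx] = 1.0   # tells the generator which domain to produce
    return torch.cat(stacked + [mask], dim=0)

imgs = {d: torch.rand(1, 64, 64) for d in (0, 1, 3)}  # domain 2 is missing
x = collagan_input(imgs, missing_idx=2)
print(x.shape)  # (image channels + mask channels, 64, 64) -> (8, 64, 64)
```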
Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses
Title | Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses |
Authors | Anna Jungbluth, Xavier Gitiaux, Shane A. Maloney, Carl Shneider, Paul J. Wright, Alfredo Kalaitzis, Michel Deudon, Atılım Güneş Baydin, Yarin Gal, Andrés Muñoz-Jaramillo |
Abstract | Breakthroughs in our understanding of physical phenomena have traditionally followed improvements in instrumentation. Studies of the magnetic field of the Sun, and its influence on the solar dynamo and space weather events, have benefited from improvements in resolution and measurement frequency of new instruments. However, in order to fully understand the solar cycle, high-quality data across time-scales longer than the typical lifespan of a solar instrument are required. At the moment, discrepancies between measurement surveys prevent the combined use of all available data. In this work, we show that machine learning can help bridge the gap between measurement surveys by learning to **super-resolve** low-resolution magnetic field images and **translate** between characteristics of contemporary instruments in orbit. We also introduce the notion of physics-based metrics and losses for super-resolution to preserve underlying physics and constrain the solution space of possible super-resolution outputs. |
Tasks | Super-Resolution |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01490v1 |
PDF | https://arxiv.org/pdf/1911.01490v1.pdf |
PWC | https://paperswithcode.com/paper/single-frame-super-resolution-of-solar |
Repo | |
Framework | |
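One plausible instance of a physics-based term for magnetograms is preservation of total unsigned magnetic flux. The sketch below is an assumption for illustration; the paper's actual metrics and losses may differ:

```python
import numpy as np

def flux_loss(sr, hr):
    """Penalize any change in total unsigned magnetic flux between the
    super-resolved output and the target (an illustrative physics term)."""
    return abs(np.abs(sr).sum() - np.abs(hr).sum()) / np.abs(hr).sum()

rng = np.random.default_rng(0)
hr = rng.normal(scale=100.0, size=(256, 256))   # target field (G), placeholder
sr = hr + rng.normal(scale=5.0, size=hr.shape)  # candidate super-resolved output

print("pixel MSE:", np.mean((sr - hr) ** 2))
print("unsigned-flux penalty:", flux_loss(sr, hr))
# Two outputs can have similar pixel MSE yet very different flux budgets,
# which is why a physics-based term helps constrain the solution space.
```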
Unsupervised Curricula for Visual Meta-Reinforcement Learning
Title | Unsupervised Curricula for Visual Meta-Reinforcement Learning |
Authors | Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn |
Abstract | In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast reinforcement learning (RL) strategies that transfer to similar tasks. However, current meta-RL approaches rely on manually-defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can “useful” pre-training tasks be discovered in an unsupervised manner? We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment. The task distribution is scaffolded by a parametric density model of the meta-learner’s trajectory distribution. We formulate unsupervised meta-RL as information maximization between a latent task variable and the meta-learner’s data distribution, and describe a practical instantiation which alternates between integration of recent experience into the task distribution and meta-learning of the updated tasks. Repeating this procedure leads to iterative reorganization such that the curriculum adapts as the meta-learner’s data distribution shifts. In particular, we show how discriminative clustering for visual representation can support trajectory-level task acquisition and exploration in domains with pixel observations, avoiding pitfalls of alternatives. In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions. |
Tasks | Meta-Learning |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04226v1 |
PDF | https://arxiv.org/pdf/1912.04226v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-curricula-for-visual-meta-1 |
Repo | |
Framework | |
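A toy schematic of the alternation the abstract describes: fit a latent task density model to recent trajectories, then derive task-consistency rewards from it. Soft k-means stands in for the paper's discriminative clustering, and everything below (embeddings, number of tasks, reward form) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def e_step(embeddings, centers):
    """Soft-assign trajectories to latent tasks (the density-model fit)."""
    d = ((embeddings[:, None, :] - centers[None]) ** 2).sum(-1)
    d = d - d.min(axis=1, keepdims=True)           # numerical stability
    resp = np.exp(-d)
    return resp / resp.sum(axis=1, keepdims=True)

def m_step(embeddings, resp):
    """Refit the task 'density model' to the current assignments."""
    return (resp.T @ embeddings) / resp.sum(axis=0)[:, None]

emb = rng.normal(size=(200, 8))     # placeholder trajectory embeddings
centers = rng.normal(size=(5, 8))   # 5 latent tasks
for _ in range(10):                 # curriculum reorganizes as data shifts
    resp = e_step(emb, centers)
    centers = m_step(emb, resp)

# reward a trajectory for being consistent with its most likely latent task
reward = np.log(e_step(emb, centers).max(axis=1) + 1e-8)
print(reward[:5])
```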
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling
Title | Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling |
Authors | Nasser Zalmout, Nizar Habash |
Abstract | Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variation in the context of full morphological tagging. We use multitask learning for joint morphological modeling of the features within two dialects, and as a knowledge-transfer scheme for cross-dialectal modeling. We use adversarial training to learn dialect-invariant features that can help the knowledge-transfer scheme from the high-resource to the low-resource variant. We work with two dialectal variants: Modern Standard Arabic (a high-resource “dialect”) and Egyptian Arabic (a low-resource dialect) as a case study. Our models achieve state-of-the-art results for both. Furthermore, adversarial training provides a particularly significant improvement when using smaller training datasets. |
Tasks | Morphological Tagging, Transfer Learning |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12702v1 |
PDF | https://arxiv.org/pdf/1910.12702v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-multitask-learning-for-joint-1 |
Repo | |
Framework | |
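Adversarial training for dialect-invariant features is commonly implemented with a gradient reversal layer. A minimal PyTorch sketch of that standard construction — the encoder, classifier, and dimensions are placeholders, and the paper's exact adversarial setup may differ:

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward
    pass, so the encoder is pushed toward dialect-invariant features while
    the dialect classifier tries to discriminate."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

features = nn.GRU(input_size=64, hidden_size=128, batch_first=True)
dialect_clf = nn.Linear(128, 2)   # MSA vs. Egyptian Arabic, illustrative

x = torch.randn(8, 20, 64)        # a batch of embedded word sequences
_, h = features(x)
rev = GradReverse.apply(h[-1], 1.0)           # reverse gradients into the encoder
logits = dialect_clf(rev)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (8,)))
loss.backward()   # the encoder receives *reversed* dialect-classification gradients
```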