Paper Group ANR 1411
Parameter Free Clustering with Cluster Catch Digraphs (Technical Report)
Title | Parameter Free Clustering with Cluster Catch Digraphs (Technical Report) |
Authors | Artür Manukyan, Elvan Ceyhan |
Abstract | We propose clustering algorithms based on a recently developed geometric digraph family called cluster catch digraphs (CCDs). These digraphs are used to devise clustering methods that are hybrids of density-based and graph-based approaches. CCDs are appealing for clustering because they estimate the number of clusters; however, CCDs (and density-based methods in general) require some information on a parameter representing the \emph{intensity} of the assumed clusters in the data set. We propose parameter-free versions of the CCD algorithm that do not require specification of the intensity parameter, whose choice is often critical in finding an optimal partitioning of the data set. We estimate the number of convex clusters by borrowing a tool from spatial data analysis, namely Ripley’s $K$ function, and call the new digraphs utilizing the $K$ function RK-CCDs. We show that the minimum dominating sets of RK-CCDs distinguish the clusters from noise clusters in a data set, and hence allow estimation of the correct number of clusters. Our robust clustering algorithms comprise methods that estimate both the number of clusters and the intensity parameter, making them completely parameter-free. We conduct Monte Carlo simulations and use real-life data sets to compare RK-CCDs with some commonly used density-based and prototype-based clustering methods. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11926v1 |
https://arxiv.org/pdf/1912.11926v1.pdf | |
PWC | https://paperswithcode.com/paper/parameter-free-clustering-with-cluster-catch |
Repo | |
Framework | |
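The $K$ function that the RK-CCD entry above borrows is a standard spatial statistic. Below is a minimal sketch of its naive estimator (edge corrections omitted), not the authors' implementation; under complete spatial randomness $K(r) \approx \pi r^2$, and clustered data deviate upward from that baseline.

```python
import numpy as np

def ripley_k(points, radii, area):
    """Naive estimate of Ripley's K for a 2-D point pattern:
    K(r) = (area / n^2) * number of ordered pairs within distance r.
    Edge corrections are omitted for brevity."""
    points = np.asarray(points)
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # exclude self-pairs
    lam = n / area                        # intensity estimate
    return np.array([(d <= r).sum() / (lam * n) for r in radii])

# Under complete spatial randomness, K(r) is close to pi * r^2.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, size=(500, 2))
print(ripley_k(pts, radii=[0.05, 0.1], area=1.0))  # roughly [0.0079, 0.0314]
```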
Communication-Efficient Integrative Regression in High-Dimensions
Title | Communication-Efficient Integrative Regression in High-Dimensions |
Authors | Subha Maity, Yuekai Sun, Moulinath Banerjee |
Abstract | We consider the task of meta-analysis in high-dimensional settings in which the data sources we wish to integrate are similar but non-identical. To borrow strength across such heterogeneous data sources, we introduce a global parameter that addresses several identification issues. We also propose a one-shot estimator of the global parameter that preserves the anonymity of the data sources and converges at a rate that depends on the size of the combined dataset. Finally, we demonstrate the benefits of our approach on a large-scale drug treatment dataset involving several different cancer cell lines. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11928v1 |
https://arxiv.org/pdf/1912.11928v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-integrative |
Repo | |
Framework | |
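The abstract does not spell out the estimator, so the sketch below shows only the generic one-shot paradigm it belongs to: each source fits a regression locally and ships a single coefficient vector, and a coordinator aggregates, so raw data never leave a source. The simple averaging here is the classic divide-and-conquer baseline, not necessarily the paper's global-parameter estimator.

```python
import numpy as np
from sklearn.linear_model import Lasso

def one_shot_average(local_datasets, alpha=0.1):
    """One-shot aggregation sketch: each source fits a sparse regression
    locally and shares only its coefficient vector; the coordinator
    averages. This preserves anonymity of the raw data, but plain
    averaging is a baseline, not the paper's proposed estimator."""
    coefs = []
    for X, y in local_datasets:          # raw data stay at the source
        coefs.append(Lasso(alpha=alpha).fit(X, y).coef_)
    return np.mean(coefs, axis=0)        # combined estimate of the global parameter
```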
Harnessing Low-Fidelity Data to Accelerate Bayesian Optimization via Posterior Regularization
Title | Harnessing Low-Fidelity Data to Accelerate Bayesian Optimization via Posterior Regularization |
Authors | Bin Liu |
Abstract | Bayesian optimization (BO) is a powerful paradigm for derivative-free global optimization of a black-box objective function (BOF) that is expensive to evaluate. However, the overhead of BO can still be prohibitive for problems with highly expensive function evaluations. In this paper, we investigate how to reduce the required number of function evaluations for BO without compromising solution quality. We explore the idea of posterior regularization to harness low-fidelity (LF) data within the Gaussian process upper confidence bound (GP-UCB) framework. The LF data can arise from previous evaluations of an LF approximation of the BOF or of a related optimization task. An extra GP model called LF-GP is trained to fit the LF data. We develop an operator termed dynamic weighted product of experts (DW-POE) fusion, which induces the regularization on the posterior of the BOF. The impact of the LF-GP model on the resulting regularized posterior is adaptively adjusted via a Bayesian formalism. Extensive experimental results on benchmark BOF optimization tasks demonstrate the superior performance of the proposed algorithm over the state of the art. |
Tasks | |
Published | 2019-02-11 |
URL | https://arxiv.org/abs/1902.03740v5 |
https://arxiv.org/pdf/1902.03740v5.pdf | |
PWC | https://paperswithcode.com/paper/harnessing-low-fidelity-data-to-accelerate |
Repo | |
Framework | |
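The fusion step is easiest to see in the fixed-weight case: a product of Gaussian experts combines posteriors by adding precisions. The sketch below shows this static version together with the GP-UCB acquisition; the paper's DW-POE operator additionally adapts the weight `w` dynamically, which is not reproduced here.

```python
import numpy as np

def poe_fuse(mu_hf, var_hf, mu_lf, var_lf, w=1.0):
    """Product of two Gaussian experts at a candidate point: precisions
    add, and means are precision-weighted. `w` tempers the low-fidelity
    expert (w=0 ignores it); DW-POE would adjust w adaptively."""
    prec = 1.0 / var_hf + w / var_lf
    mu = (mu_hf / var_hf + w * mu_lf / var_lf) / prec
    return mu, 1.0 / prec

def gp_ucb(mu, var, beta=2.0):
    """GP-UCB acquisition on the fused posterior: favor points that are
    promising (high mean) or poorly understood (high variance)."""
    return mu + np.sqrt(beta * var)
```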
A Probabilistic framework for Quantum Clustering
Title | A Probabilistic framework for Quantum Clustering |
Authors | Raúl V. Casaña-Eslava, Paulo J. G. Lisboa, Sandra Ortega-Martorell, Ian H. Jarman, José D. Martín-Guerrero |
Abstract | Quantum Clustering is a powerful method to detect clusters in data with mixed density. However, it is very sensitive to a length parameter that is inherent to the Schrödinger equation. In addition, linking data points into clusters requires local estimates of covariance that are also controlled by length parameters. This raises the question of how to adjust the control parameters of the Schrödinger equation for optimal clustering. We propose a probabilistic framework that provides an objective function for the goodness-of-fit to the data, enabling the control parameters to be optimised within a Bayesian framework. This naturally yields probabilities of cluster membership and data partitions with specific numbers of clusters. The proposed framework is tested on real and synthetic data sets, assessing its validity by measuring concordance with known data structure by means of the Jaccard score (JS). This work also proposes an objective way to measure performance in unsupervised learning that correlates very well with JS. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.05578v1 |
http://arxiv.org/pdf/1902.05578v1.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-framework-for-quantum |
Repo | |
Framework | |
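For reference, the length parameter in question enters through the quantum potential of Horn and Gottlieb's original Quantum Clustering, sketched below; cluster centers appear as minima of $V$. The paper's contribution, a probabilistic criterion for choosing $\sigma$, is not reproduced here.

```python
import numpy as np

def quantum_potential(x, data, sigma):
    """Quantum-clustering potential: the data define a Parzen wave function
    psi(x) = sum_i exp(-||x - x_i||^2 / (2 sigma^2)), and solving the
    Schrodinger equation for the potential gives
    V(x) = -d/2 + sum_i r_i^2 k_i / (2 sigma^2 psi). Minima of V mark
    cluster centers; sigma is the length parameter the paper tunes."""
    diffs = np.asarray(data) - np.asarray(x)   # (n, d)
    r2 = np.sum(diffs ** 2, axis=1)            # squared distances to x
    k = np.exp(-r2 / (2 * sigma ** 2))         # Gaussian kernel values
    d = diffs.shape[1]
    return -d / 2 + (r2 * k).sum() / (2 * sigma ** 2 * k.sum())
```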
Influence-aware Memory for Deep Reinforcement Learning
Title | Influence-aware Memory for Deep Reinforcement Learning |
Authors | Miguel Suau, Elena Congeduti, Rolf Starre, Aleksander Czechowski, Frans Oliehoek |
Abstract | Making the right decisions when some of the state variables are hidden involves reasoning about all the possible states of the environment. An agent receiving only partial observations needs to infer the true values of these hidden variables based on its history of experiences. Recent deep reinforcement learning methods use recurrent models to keep track of past information. However, these models are sometimes expensive to train and have convergence difficulties, especially when dealing with high-dimensional input spaces. Taking inspiration from influence-based abstraction, we show that effective policies can be learned in the presence of uncertainty by only memorizing a small subset of input variables. We also incorporate a mechanism in our network that learns to automatically choose the important pieces of information that need to be remembered. The results indicate that, by forcing the agent’s internal memory to focus on the selected regions while treating the rest of the observable variables as Markovian, we can outperform ordinary recurrent architectures in situations where the amount of information that the agent needs to retain represents a small fraction of the entire observation input. The method also reduces training time and obtains better scores than methods that stack multiple observations to remove partial observability in domains where long-term memory is required. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07643v2 |
https://arxiv.org/pdf/1911.07643v2.pdf | |
PWC | https://paperswithcode.com/paper/influence-aware-memory-for-deep-reinforcement-1 |
Repo | |
Framework | |
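A schematic reading of the architecture described in the abstract, with illustrative layer sizes and a soft selection mask standing in for whatever attention mechanism the authors use: a recurrent core remembers only the selected variables, while a feedforward branch treats the rest of the observation as Markovian.

```python
import torch
import torch.nn as nn

class InfluenceAwareMemory(nn.Module):
    """Sketch of an influence-aware memory agent: the GRU sees a learned
    soft selection of the observation; the feedforward branch sees the
    full current observation. Sizes are illustrative, not the paper's."""
    def __init__(self, obs_dim, mem_dim=32, hid_dim=64, n_actions=4):
        super().__init__()
        self.select = nn.Linear(obs_dim, obs_dim)   # learned selection mask
        self.rnn = nn.GRU(obs_dim, mem_dim, batch_first=True)
        self.ff = nn.Linear(obs_dim, hid_dim)
        self.head = nn.Linear(mem_dim + hid_dim, n_actions)

    def forward(self, obs_seq):                     # (batch, time, obs_dim)
        mask = torch.sigmoid(self.select(obs_seq))  # what to remember
        mem, _ = self.rnn(mask * obs_seq)           # memory over selected inputs
        ff = torch.relu(self.ff(obs_seq))           # Markovian branch
        return self.head(torch.cat([mem, ff], dim=-1))
```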
Generalizing Information to the Evolution of Rational Belief
Title | Generalizing Information to the Evolution of Rational Belief |
Authors | Jed A. Duersch, Thomas A. Catanach |
Abstract | Information theory provides a mathematical foundation to measure uncertainty in belief. Belief is represented by a probability distribution that captures our understanding of an outcome’s plausibility. Information measures based on Shannon’s concept of entropy include realization information, Kullback-Leibler divergence, Lindley’s information in experiment, cross entropy, and mutual information. We derive a general theory of information from first principles that accounts for evolving belief and recovers all of these measures. Rather than simply gauging uncertainty, information is understood in this theory to measure change in belief. We may then regard entropy as the information we expect to gain upon realization of a discrete latent random variable. This theory of information is compatible with the Bayesian paradigm in which rational belief is updated as evidence becomes available. Furthermore, this theory admits novel measures of information with well-defined properties, which we explore in both analysis and experiment. This view of information illuminates the study of machine learning by allowing us to quantify information captured by a predictive model and distinguish it from residual information contained in training data. We gain related insights regarding feature selection, anomaly detection, and novel Bayesian approaches. |
Tasks | Anomaly Detection, Feature Selection |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09559v2 |
https://arxiv.org/pdf/1911.09559v2.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-information-to-the-evolution-of |
Repo | |
Framework | |
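For reference, the standard Shannon-based measures the abstract says the theory recovers can be written as follows; the paper's general measure of belief change itself is not reproduced here.

```latex
% Standard definitions recovered as special cases:
\[
  H(p) = -\sum_{x} p(x) \log p(x), \qquad
  D_{\mathrm{KL}}(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)},
\]
\[
  I(X;Y) = \mathbb{E}_{y \sim p(y)}\!\left[
    D_{\mathrm{KL}}\big(p(x \mid y) \,\|\, p(x)\big) \right],
\]
% i.e., mutual information is the expected change in belief about X upon
% realization of Y, matching the "information as change in belief" view.
```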
Matlab vs. OpenCV: A Comparative Study of Different Machine Learning Algorithms
Title | Matlab vs. OpenCV: A Comparative Study of Different Machine Learning Algorithms |
Authors | Ahmed A. Elsayed, Waleed A. Yousef |
Abstract | Scientific computing relies on executing computer algorithms coded in some programming language. Given particular hardware, algorithm speed is a crucial factor. There are many scientific computing environments used to code such algorithms. Matlab is one of the most tremendously successful and widespread scientific computing environments, rich in toolboxes, libraries, and data visualization tools. OpenCV is a (C++)-based library written primarily for Computer Vision and its related areas. This paper presents a comparative study using 20 different real datasets to compare the speed of Matlab and OpenCV for some Machine Learning algorithms. Although Matlab is more convenient for development and data presentation, OpenCV is much faster in execution, with speed ratios exceeding 80 in some cases. The best of both worlds can be achieved by first using Matlab or a similar environment to select the most successful algorithm, and then implementing the selected algorithm in OpenCV or a similar environment to gain speed. |
Tasks | |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01213v4 |
https://arxiv.org/pdf/1905.01213v4.pdf | |
PWC | https://paperswithcode.com/paper/matlab-vs-opencv-a-comparative-study-of |
Repo | |
Framework | |
Exploiting video sequences for unsupervised disentangling in generative adversarial networks
Title | Exploiting video sequences for unsupervised disentangling in generative adversarial networks |
Authors | Facundo Tuesca, Lucas C. Uzal |
Abstract | In this work we present an adversarial training algorithm that exploits correlations in video to learn, without supervision, an image generator model with a disentangled latent space. The proposed methodology requires only a few modifications to the standard Generative Adversarial Network (GAN) algorithm and involves training with sets of frames taken from short videos. We train our model on two datasets of face-centered videos that show different people speaking or moving their heads: the VidTIMIT and YouTube Faces datasets. We found that our proposal allows us to split the generator latent space into two subspaces. One of them controls content attributes, those that do not change along short video sequences; for the considered datasets, this is the identity of the generated face. The other subspace controls motion attributes, those that are observed to change along short videos; we observed that these motion attributes are facial expressions, head orientation, and lip and eye movement. The presented experiments provide quantitative and qualitative evidence that the proposed methodology induces a disentangling of these two kinds of attributes in the latent space. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.11104v1 |
https://arxiv.org/pdf/1910.11104v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-video-sequences-for-unsupervised |
Repo | |
Framework | |
Anomaly scores for generative models
Title | Anomaly scores for generative models |
Authors | Václav Šmídl, Jan Bím, Tomáš Pevný |
Abstract | Reconstruction error is a prevalent score used to identify anomalous samples when data are modeled by generative models such as (variational) auto-encoders or generative adversarial networks. This score relies on the assumption that normal samples are located on a manifold and all anomalous samples are located outside it. Since the manifold can be learned only where the training data lie, there are no guarantees on how the reconstruction error behaves elsewhere, and the score therefore seems to be ill-defined. This work defines an anomaly score that is theoretically compatible with generative models and is particularly natural for (variational) auto-encoders, as they are prevalent. The new score can also be used to select hyper-parameters and models. Finally, we explain why reconstruction error delivers good experimental results despite its weak theoretical justification. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11890v1 |
https://arxiv.org/pdf/1905.11890v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-scores-for-generative-models |
Repo | |
Framework | |
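For context, the prevalent score the abstract critiques is just the distance between a sample and its reconstruction; a minimal version is below, with `encode`/`decode` as placeholders for any trained auto-encoder pair. The paper's proposed score is not reproduced here.

```python
import numpy as np

def reconstruction_error(x, encode, decode):
    """The prevalent anomaly score: distance between a sample and its
    reconstruction through the model. Well-behaved on the learned
    manifold, but unconstrained away from the training data, which is
    the ill-posedness the paper addresses."""
    x = np.asarray(x)
    x_hat = decode(encode(x))
    return float(np.linalg.norm(x - x_hat))

# Toy usage with a near-identity "model": in-manifold samples score ~0.
print(reconstruction_error(np.ones(4), lambda v: v, lambda z: z + 0.01))
```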
Collaborative Machine Learning at the Wireless Edge with Blind Transmitters
Title | Collaborative Machine Learning at the Wireless Edge with Blind Transmitters |
Authors | Mohammad Mohammadi Amiri, Tolga M. Duman, Deniz Gunduz |
Abstract | We study wireless collaborative machine learning (ML), where mobile edge devices, each with its own dataset, carry out distributed stochastic gradient descent (DSGD) over-the-air with the help of a wireless access point acting as the parameter server (PS). At each iteration of the DSGD algorithm, wireless devices compute gradient estimates with their local datasets and send them to the PS over a wireless fading multiple access channel (MAC). Motivated by the additive nature of the wireless MAC, we propose an analog DSGD scheme in which the devices transmit scaled versions of their gradient estimates in an uncoded fashion. We assume that the channel state information (CSI) is available only at the PS. We instead allow the PS to employ multiple antennas to alleviate the destructive fading effect, which cannot be cancelled by the transmitters due to the lack of CSI. Theoretical analysis indicates that, with the proposed DSGD scheme, increasing the number of PS antennas mitigates the fading effect, and, in the limit, the effects of fading and noise disappear and the PS receives aligned signals used to update the model parameter. The theoretical results are then corroborated by experimental results. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03909v1 |
https://arxiv.org/pdf/1907.03909v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-machine-learning-at-the |
Repo | |
Framework | |
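A toy simulation of the additive over-the-air idea: the fading MAC superimposes all transmissions at each PS antenna, and combining across many antennas recovers, approximately, the gradient sum. The matched-filter combiner below is one natural choice consistent with the abstract's PS-side CSI assumption, not necessarily the paper's exact receiver.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, d = 10, 4096, 4            # devices, PS antennas, gradient dimension
g = rng.normal(size=(M, d))      # local gradient estimates
h = (rng.normal(size=(M, K)) + 1j * rng.normal(size=(M, K))) / np.sqrt(2)

# The fading MAC superimposes all faded transmissions at each antenna.
y = np.einsum('mk,md->kd', h, g.astype(complex))
y += 0.1 * (rng.normal(size=(K, d)) + 1j * rng.normal(size=(K, d)))

# Matched-filter-style combining with PS-side CSI: cross terms and noise
# average out as K grows, leaving roughly the sum of all gradients.
est = np.real(np.einsum('k,kd->d', np.conj(h.sum(axis=0)), y)) / K
print(np.abs(est - g.sum(axis=0)).max())  # shrinks as K increases
```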
U2-Net: A Bayesian U-Net model with epistemic uncertainty feedback for photoreceptor layer segmentation in pathological OCT scans
Title | U2-Net: A Bayesian U-Net model with epistemic uncertainty feedback for photoreceptor layer segmentation in pathological OCT scans |
Authors | José Ignacio Orlando, Philipp Seeböck, Hrvoje Bogunović, Sophie Klimscha, Christoph Grechenig, Sebastian Waldstein, Bianca S. Gerendas, Ursula Schmidt-Erfurth |
Abstract | In this paper, we introduce a Bayesian deep learning based model for segmenting the photoreceptor layer in pathological OCT scans. Our architecture provides accurate segmentations of the photoreceptor layer and produces pixel-wise epistemic uncertainty maps that highlight potential areas of pathologies or segmentation errors. We empirically evaluated this approach on two sets of pathological OCT scans of patients with age-related macular degeneration, retinal vein occlusion, and diabetic macular edema, improving the performance of the baseline U-Net both in terms of the Dice index and the area under the precision/recall curve. We also observed that the uncertainty estimates were inversely correlated with the model performance, underlining their utility for highlighting areas where manual inspection or correction might be needed. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07929v2 |
https://arxiv.org/pdf/1901.07929v2.pdf | |
PWC | https://paperswithcode.com/paper/u2-net-a-bayesian-u-net-model-with-epistemic |
Repo | |
Framework | |
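The standard recipe for pixel-wise epistemic uncertainty in a Bayesian U-Net is Monte Carlo dropout: keep the dropout layers sampling at test time and take the per-pixel variance over several stochastic forward passes. A sketch follows; whether it matches the authors' exact variant is an assumption.

```python
import torch

def mc_dropout_uncertainty(model, image, n_samples=20):
    """Monte Carlo dropout: keep dropout active at test time, run several
    stochastic forward passes, and return the per-pixel mean (segmentation)
    and variance (epistemic uncertainty map). `model` is any segmentation
    network with dropout layers, e.g. a U-Net."""
    model.eval()
    for m in model.modules():                     # re-enable only dropout
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(image))
                             for _ in range(n_samples)])
    return probs.mean(0), probs.var(0)
```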
DeVLearn: A Deep Visual Learning Framework for Localizing Temporary Faults in Power Systems
Title | DeVLearn: A Deep Visual Learning Framework for Localizing Temporary Faults in Power Systems |
Authors | Shuchismita Biswas, Rounak Meyur, Virgilio Centeno |
Abstract | Frequently recurring transient faults in a transmission network may be indicative of impending permanent failures. Hence, determining their location is a critical task. This paper proposes a novel image-embedding-aided deep learning framework called DeVLearn for faulted line location using PMU measurements at generator buses. Inspired by breakthroughs in computer vision, DeVLearn represents measurements (one-dimensional time series data) as two-dimensional unthresholded Recurrence Plot (RP) images. These RP images preserve the temporal relationships present in the original time series and are used to train a deep Variational Auto-Encoder (VAE). The VAE learns the distribution of latent features in the images. Our results show that for faults on two different lines in the IEEE 68-bus network, DeVLearn is able to project PMU measurements into a two-dimensional space such that data for faults at different locations separate into well-defined clusters. This compressed representation may then be used with off-the-shelf classifiers to determine fault location. The efficacy of the proposed framework is demonstrated using local voltage magnitude measurements at two generator buses. |
Tasks | Time Series |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03759v1 |
https://arxiv.org/pdf/1911.03759v1.pdf | |
PWC | https://paperswithcode.com/paper/devlearn-a-deep-visual-learning-framework-for |
Repo | |
Framework | |
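The image embedding at DeVLearn's core is simple to reproduce; the sketch below builds an unthresholded recurrence plot directly from the raw series (the paper may apply a time-delay embedding first, which this omits).

```python
import numpy as np

def unthresholded_recurrence_plot(series):
    """Unthresholded recurrence plot of a 1-D series: entry (i, j) is the
    distance |x_i - x_j|, turning temporal structure into 2-D texture
    that an image model such as the VAE can ingest."""
    x = np.asarray(series, dtype=float)
    return np.abs(x[:, None] - x[None, :])

# One PMU measurement window becomes one (128, 128) "image".
rp = unthresholded_recurrence_plot(np.sin(np.linspace(0, 8 * np.pi, 128)))
print(rp.shape)
```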
From Clustering to Cluster Explanations via Neural Networks
Title | From Clustering to Cluster Explanations via Neural Networks |
Authors | Jacob Kauffmann, Malte Esders, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller |
Abstract | A wealth of algorithms have been developed to extract natural cluster structure in data. Identifying this structure is desirable but not always sufficient: We may also want to understand why the data points have been assigned to a given cluster. Clustering algorithms do not offer a systematic answer to this simple question. Hence we propose a new framework that can, for the first time, explain cluster assignments in terms of input features in a comprehensive manner. It is based on the novel theoretical insight that clustering models can be rewritten as neural networks, or ‘neuralized’. Predictions of the obtained networks can then be quickly and accurately attributed to the input features. Several showcases demonstrate the ability of our method to assess the quality of learned clusters and to extract novel insights from the analyzed data and representations. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07633v1 |
https://arxiv.org/pdf/1906.07633v1.pdf | |
PWC | https://paperswithcode.com/paper/from-clustering-to-cluster-explanations-via |
Repo | |
Framework | |
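The neuralization insight is concrete in the k-means case: because the quadratic terms cancel, the evidence for cluster $c$, $h_c(x) = \min_{k \neq c}\,(\lVert x-\mu_k\rVert^2 - \lVert x-\mu_c\rVert^2)$, is a min-pooling over functions linear in $x$, i.e. a small neural network. A sketch of that rewriting (without the subsequent attribution step) follows.

```python
import numpy as np

def neuralized_kmeans_logits(x, centroids):
    """K-means rewritten in network form: h_c(x) is positive exactly when
    c is the assigned cluster, and each h_c is a min-pooling over linear
    functions of x. Attribution methods for neural networks can then
    explain the assignment (not shown here)."""
    d2 = ((np.asarray(centroids) - np.asarray(x)) ** 2).sum(axis=1)
    return np.array([np.min(np.delete(d2, c) - d2[c])
                     for c in range(len(d2))])   # argmax = k-means assignment
```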
PODNet: A Neural Network for Discovery of Plannable Options
Title | PODNet: A Neural Network for Discovery of Plannable Options |
Authors | Ritwik Bera, Vinicius G. Goecks, Gregory M. Gremillion, John Valasek, Nicholas R. Waytowich |
Abstract | Learning from demonstration has been widely studied in machine learning but becomes challenging when the demonstrated trajectories are unstructured and follow different objectives. This short paper proposes PODNet, the Plannable Option Discovery Network, which addresses how to segment an unstructured set of demonstrated trajectories for option discovery. This enables learning from demonstration to perform multiple tasks and to plan high-level trajectories based on the discovered option labels. PODNet combines a custom categorical variational autoencoder, a recurrent option inference network, an option-conditioned policy network, and an option dynamics model in an end-to-end learning architecture. Because the option-conditioned policy network and option dynamics model are trained concurrently, the proposed architecture has implications for multi-task and hierarchical learning, explainable and interpretable artificial intelligence, and applications where the agent is required to learn only from observations. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00171v3 |
https://arxiv.org/pdf/1911.00171v3.pdf | |
PWC | https://paperswithcode.com/paper/podnet-a-neural-network-for-discovery-of |
Repo | |
Framework | |
Max-Sliced Wasserstein Distance and its use for GANs
Title | Max-Sliced Wasserstein Distance and its use for GANs |
Authors | Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander Schwing |
Abstract | Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning. However, to model high-dimensional distributions, sequential training and stacked architectures are common, increasing the number of tunable hyper-parameters as well as the training time. Nonetheless, the sample complexity of the distance metrics remains one of the factors affecting GAN training. We first show that the recently proposed sliced Wasserstein distance has compelling sample complexity properties when compared to the Wasserstein distance. To further improve the sliced Wasserstein distance, we then analyze its ‘projection complexity’ and develop the max-sliced Wasserstein distance, which enjoys compelling sample complexity while reducing projection complexity, albeit at the cost of a max estimation. We finally illustrate that the proposed distance easily trains GANs on high-dimensional images up to a resolution of 256x256. |
Tasks | Image-to-Image Translation |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05877v1 |
http://arxiv.org/pdf/1904.05877v1.pdf | |
PWC | https://paperswithcode.com/paper/max-sliced-wasserstein-distance-and-its-use |
Repo | |
Framework | |
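A minimal empirical version of the distance, with random search over unit directions standing in for the paper's optimized slicing direction; the 1-D Wasserstein-2 distance between equal-sized samples reduces to comparing sorted projections.

```python
import numpy as np

def max_sliced_w2(x, y, n_directions=500, seed=0):
    """Max-sliced Wasserstein-2 between two equal-sized samples: project
    onto unit directions, use the closed-form 1-D coupling (sorting),
    and keep the worst direction. Random search approximates the max."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_directions, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    best = 0.0
    for w in dirs:
        px, py = np.sort(x @ w), np.sort(y @ w)   # optimal 1-D coupling
        best = max(best, np.sqrt(np.mean((px - py) ** 2)))
    return best
```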