Paper Group ANR 633
Spatial features of synaptic adaptation affecting learning performance
Title | Spatial features of synaptic adaptation affecting learning performance |
Authors | Damian L. Berger, Lucilla de Arcangelis, Hans J. Herrmann |
Abstract | Recent studies have proposed that the diffusion of messenger molecules, such as monoamines, can mediate the plastic adaptation of synapses in supervised learning of neural networks. Based on these findings we developed a model for neural learning, where the signal for plastic adaptation is assumed to propagate through the extracellular space. We investigate the conditions allowing learning of Boolean rules in a neural network. Even fully excitatory networks show very good learning performance. Moreover, the investigation of the plastic adaptation features optimizing the performance suggests that learning is very sensitive to the extent of the plastic adaptation and the spatial range of synaptic connections. |
Tasks | |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06950v1 |
http://arxiv.org/pdf/1709.06950v1.pdf | |
PWC | https://paperswithcode.com/paper/spatial-features-of-synaptic-adaptation |
Repo | |
Framework | |
Parallel Multiscale Autoregressive Density Estimation
Title | Parallel Multiscale Autoregressive Density Estimation |
Authors | Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas |
Abstract | PixelCNN achieves state-of-the-art results in density estimation for natural images. Although training is fast, inference is costly, requiring one network evaluation per pixel; O(N) for N pixels. This can be sped up by caching activations, but still involves generating each pixel sequentially. In this work, we propose a parallelized PixelCNN that allows more efficient inference by modeling certain pixel groups as conditionally independent. Our new PixelCNN model achieves competitive density estimation and orders of magnitude speedup - O(log N) sampling instead of O(N) - enabling the practical generation of 512x512 images. We evaluate the model on class-conditional image generation, text-to-image synthesis, and action-conditional video generation, showing that our model achieves the best results among non-pixel-autoregressive density models that allow efficient sampling. |
Tasks | Conditional Image Generation, Density Estimation, Image Generation, Video Generation |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03664v1 |
http://arxiv.org/pdf/1703.03664v1.pdf | |
PWC | https://paperswithcode.com/paper/parallel-multiscale-autoregressive-density |
Repo | |
Framework | |
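
The O(N) versus O(log N) claim in the abstract above can be illustrated by counting sequential sampling stages. The following is a minimal sketch, not the authors' implementation: it assumes a small base image is sampled pixel by pixel and that each doubling of the resolution costs a fixed number of parallel group-sampling stages. The names `base_size` and `groups_per_doubling`, and their default values, are hypothetical choices made for illustration only.

```python
def sequential_steps(height: int, width: int) -> int:
    """One network evaluation per pixel, as in a plain pixel-by-pixel sampler."""
    return height * width


def multiscale_steps(size: int, base_size: int = 4, groups_per_doubling: int = 3) -> int:
    """Count sequential stages for a coarse-to-fine sampler (illustrative assumption).

    The base_size x base_size image is sampled pixel by pixel, then every
    resolution doubling adds `groups_per_doubling` stages, each of which
    samples a whole group of pixels in parallel.
    """
    ratio = size // base_size
    # Require size = base_size * 2**k so the number of doublings is an integer.
    if ratio < 1 or ratio * base_size != size or ratio & (ratio - 1) != 0:
        raise ValueError("size must be base_size times a power of two")
    doublings = ratio.bit_length() - 1  # exact log2 for powers of two
    return base_size * base_size + groups_per_doubling * doublings


if __name__ == "__main__":
    print(sequential_steps(512, 512))  # 262144 sequential stages
    print(multiscale_steps(512))       # 37 sequential stages
```

The count only tracks how many stages must run one after another; it says nothing about the cost of each stage or about sample quality.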
A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds
Title | A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds |
Authors | Pooria Joulani, András György, Csaba Szepesvári |
Abstract | Recently, much work has been done on extending the scope of online learning and incremental stochastic optimization algorithms. In this paper we contribute to this effort in two ways: First, based on a new regret decomposition and a generalization of Bregman divergences, we provide a self-contained, modular analysis of the two workhorses of online learning: (general) adaptive versions of Mirror Descent (MD) and the Follow-the-Regularized-Leader (FTRL) algorithms. The analysis is done with extra care so as not to introduce assumptions not needed in the proofs, and it allows one to combine, in a straightforward way, different algorithmic ideas (e.g., adaptivity, optimism, implicit updates) and learning settings (e.g., strongly convex or composite objectives). This way we are able to reprove, extend and refine a large body of the literature, while keeping the proofs concise. The second contribution is a byproduct of this careful analysis: We present algorithms with improved variational bounds for smooth, composite objectives, including a new family of optimistic MD algorithms with only one projection step per round. Furthermore, we provide a simple extension of adaptive regret bounds to practically relevant non-convex problem settings with essentially no extra effort. |
Tasks | Stochastic Optimization |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02726v1 |
http://arxiv.org/pdf/1709.02726v1.pdf | |
PWC | https://paperswithcode.com/paper/a-modular-analysis-of-adaptive-non-convex |
Repo | |
Framework | |
Review of Machine Learning Algorithms in Differential Expression Analysis
Title | Review of Machine Learning Algorithms in Differential Expression Analysis |
Authors | Irina Kuznetsova, Yuliya V Karpievitch, Aleksandra Filipovska, Artur Lugmayr, Andreas Holzinger |
Abstract | In biological research, machine learning algorithms are part of nearly every analytical process. They are used to identify new insights into biological phenomena, interpret data, provide molecular diagnosis for diseases and develop personalized medicine that will enable future treatments of diseases. In this paper we (1) illustrate the importance of machine learning in the analysis of large scale sequencing data, (2) present an illustrative standardized workflow of the analysis process, (3) perform a Differential Expression (DE) analysis of a publicly available RNA sequencing (RNA-Seq) data set to demonstrate the capabilities of various algorithms at each step of the workflow, and (4) show a machine learning solution that improves computing time, reduces storage requirements, and minimizes computer memory use in analyses of RNA-Seq datasets. The source code of the analysis pipeline and associated scripts are presented in the paper appendix to allow replication of experiments. |
Tasks | |
Published | 2017-07-28 |
URL | http://arxiv.org/abs/1707.09837v1 |
http://arxiv.org/pdf/1707.09837v1.pdf | |
PWC | https://paperswithcode.com/paper/review-of-machine-learning-algorithms-in |
Repo | |
Framework | |
m-TSNE: A Framework for Visualizing High-Dimensional Multivariate Time Series
Title | m-TSNE: A Framework for Visualizing High-Dimensional Multivariate Time Series |
Authors | Minh Nguyen, Sanjay Purushotham, Hien To, Cyrus Shahabi |
Abstract | Multivariate time series (MTS) have become increasingly common in healthcare domains where human vital signs and laboratory results are collected for predictive diagnosis. Recently, there have been increasing efforts to visualize healthcare MTS data based on star charts or parallel coordinates. However, such techniques might not be ideal for visualizing a large MTS dataset, since it is difficult to obtain insights or interpretations due to the inherent high dimensionality of MTS. In this paper, we propose ‘m-TSNE’: a simple and novel framework to visualize high-dimensional MTS data by projecting them into a low-dimensional (2-D or 3-D) space while capturing the underlying data properties. Our framework is easy to use and provides interpretable insights for healthcare professionals to understand MTS data. We evaluate our visualization framework on two real-world datasets and demonstrate that the results of our m-TSNE show patterns that are easy to understand while the other methods’ visualization may have limitations in interpretability. |
Tasks | Time Series |
Published | 2017-08-26 |
URL | http://arxiv.org/abs/1708.07942v1 |
http://arxiv.org/pdf/1708.07942v1.pdf | |
PWC | https://paperswithcode.com/paper/m-tsne-a-framework-for-visualizing-high |
Repo | |
Framework | |
When is Network Lasso Accurate: The Vector Case
Title | When is Network Lasso Accurate: The Vector Case |
Authors | Nguyen Tran, Saeed Basirian, Alexander Jung |
Abstract | A recently proposed learning algorithm for massive network-structured data sets (big data over networks) is the network Lasso (nLasso), which extends the well-known Lasso estimator from sparse models to network-structured datasets. Efficient implementations of the nLasso have been presented using modern convex optimization methods. In this paper, we provide sufficient conditions on the network structure and available label information such that nLasso accurately learns a vector-valued graph signal (representing label information) from the information provided by the labels of a few data points. |
Tasks | |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.03942v1 |
http://arxiv.org/pdf/1710.03942v1.pdf | |
PWC | https://paperswithcode.com/paper/when-is-network-lasso-accurate-the-vector |
Repo | |
Framework | |
Black-Box Optimization in Machine Learning with Trust Region Based Derivative Free Algorithm
Title | Black-Box Optimization in Machine Learning with Trust Region Based Derivative Free Algorithm |
Authors | Hiva Ghanbari, Katya Scheinberg |
Abstract | In this work, we utilize a Trust Region based Derivative Free Optimization (DFO-TR) method to directly maximize the Area Under Receiver Operating Characteristic Curve (AUC), which is a nonsmooth, noisy function. We show that AUC is a smooth function, in expectation, if the distributions of the positive and negative data points obey a jointly normal distribution. The practical performance of this algorithm is compared to three prominent Bayesian optimization methods and random search. The presented numerical results show that DFO-TR surpasses Bayesian optimization and random search on various black-box optimization problems, such as maximizing AUC and hyperparameter tuning. |
Tasks | Text-to-Image Generation |
Published | 2017-03-20 |
URL | http://arxiv.org/abs/1703.06925v1 |
http://arxiv.org/pdf/1703.06925v1.pdf | |
PWC | https://paperswithcode.com/paper/black-box-optimization-in-machine-learning |
Repo | |
Framework | |
Data Clustering using a Hybrid of Fuzzy C-Means and Quantum-behaved Particle Swarm Optimization
Title | Data Clustering using a Hybrid of Fuzzy C-Means and Quantum-behaved Particle Swarm Optimization |
Authors | Saptarshi Sengupta, Sanchita Basak, Richard Alan Peters II |
Abstract | Fuzzy clustering has become a widely used data mining technique and plays an important role in grouping, traversing and selectively using data for user specified applications. The deterministic Fuzzy C-Means (FCM) algorithm may result in suboptimal solutions when applied to multidimensional data in real-world, time-constrained problems. In this paper the Quantum-behaved Particle Swarm Optimization (QPSO) with a fully connected topology is coupled with the Fuzzy C-Means Clustering algorithm and is tested on a suite of datasets from the UCI Machine Learning Repository. The global search ability of the QPSO algorithm helps in avoiding stagnation in local optima while the soft clustering approach of FCM helps to partition data based on membership probabilities. Clustering performance indices such as F-Measure, Accuracy, Quantization Error, Intercluster and Intracluster distances are reported for competitive techniques such as PSO K-Means, QPSO K-Means and QPSO FCM over all datasets considered. Experimental results indicate that QPSO FCM provides comparable and in most cases superior results when compared to the others. |
Tasks | Quantization |
Published | 2017-12-15 |
URL | http://arxiv.org/abs/1712.05512v1 |
http://arxiv.org/pdf/1712.05512v1.pdf | |
PWC | https://paperswithcode.com/paper/data-clustering-using-a-hybrid-of-fuzzy-c |
Repo | |
Framework | |
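
The soft partitioning that the abstract above attributes to FCM can be sketched with the textbook Fuzzy C-Means updates. This is a minimal illustration of standard FCM only, not the QPSO hybrid described in the paper; the function names and the fuzzifier default `m=2.0` are assumptions made here.

```python
import math


def memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Standard FCM rule: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share full
            # membership among those centers to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            u.append([h / total for h in hits])
        else:
            u.append([1.0 / sum((d_j / d_k) ** power for d_k in dists)
                      for d_j in dists])
    return u


def update_centers(points, u, m=2.0):
    """Recompute each center as the mean of the points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    ctrs = [(0.0, 0.5), (10.0, 0.5)]
    u = memberships(pts, ctrs)
    print(u[0])                     # membership of the first point in each cluster
    print(update_centers(pts, u))   # centers after one update
```

Alternating the two functions until the centers stop moving gives plain FCM; a swarm-based method such as the one in the paper would instead search over candidate center positions.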
Mixed Membership Word Embeddings for Computational Social Science
Title | Mixed Membership Word Embeddings for Computational Social Science |
Authors | James Foulds |
Abstract | Word embeddings improve the performance of NLP systems by revealing the hidden structural relationships between words. Despite their success in many applications, word embeddings have seen very little use in computational social science NLP tasks, presumably due to their reliance on big data, and to a lack of interpretability. I propose a probabilistic model-based word embedding method which can recover interpretable embeddings, without big data. The key insight is to leverage mixed membership modeling, in which global representations are shared, but individual entities (i.e. dictionary words) are free to use these representations to uniquely differing degrees. I show how to train the model using a combination of state-of-the-art training techniques for word embeddings and topic models. The experimental results show an improvement in predictive language modeling of up to 63% in MRR over the skip-gram, and demonstrate that the representations are beneficial for supervised learning. I illustrate the interpretability of the models with computational social science case studies on State of the Union addresses and NIPS articles. |
Tasks | Language Modelling, Topic Models, Word Embeddings |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07368v3 |
http://arxiv.org/pdf/1705.07368v3.pdf | |
PWC | https://paperswithcode.com/paper/mixed-membership-word-embeddings-for |
Repo | |
Framework | |
Convex Coupled Matrix and Tensor Completion
Title | Convex Coupled Matrix and Tensor Completion |
Authors | Kishan Wimalawarne, Makoto Yamada, Hiroshi Mamitsuka |
Abstract | We propose a set of convex low-rank-inducing norms for coupled matrices and tensors (hereafter coupled tensors), which share information between matrices and tensors through common modes. More specifically, we propose a mixture of the overlapped trace norm and the latent norms with the matrix trace norm, and then we propose a new completion algorithm based on the proposed norms. A key advantage of the proposed norms is that they are convex and can find a globally optimal solution, while existing methods for coupled learning are non-convex. Furthermore, we analyze the excess risk bounds of the completion model regularized by our proposed norms, which show that our proposed norms can exploit the low rankness of coupled tensors, leading to better bounds compared to uncoupled norms. Through synthetic and real-world data experiments, we show that the proposed completion algorithm compares favorably with existing completion algorithms. |
Tasks | |
Published | 2017-05-15 |
URL | http://arxiv.org/abs/1705.05197v2 |
http://arxiv.org/pdf/1705.05197v2.pdf | |
PWC | https://paperswithcode.com/paper/convex-coupled-matrix-and-tensor-completion |
Repo | |
Framework | |
An Empirical Approach for Modeling Fuzzy Geographical Descriptors
Title | An Empirical Approach for Modeling Fuzzy Geographical Descriptors |
Authors | Alejandro Ramos-Soto, Jose M. Alonso, Ehud Reiter, Kees van Deemter, Albert Gatt |
Abstract | We present a novel heuristic approach that defines fuzzy geographical descriptors using data gathered from a survey with human subjects. The participants were asked to provide graphical interpretations of the descriptors ‘north’ and ‘south’ for the Galician region (Spain). Based on these interpretations, our approach builds fuzzy descriptors that are able to compute membership degrees for geographical locations. We evaluated our approach in terms of efficiency and precision. The fuzzy descriptors are meant to be used as the cornerstones of a geographical referring expression generation algorithm that is able to linguistically characterize geographical locations and regions. This work is also part of a general research effort that intends to establish a methodology which reunites the empirical studies traditionally practiced in data-to-text and the use of fuzzy sets to model imprecision and vagueness in words and expressions for text generation purposes. |
Tasks | Text Generation |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10429v1 |
http://arxiv.org/pdf/1703.10429v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-approach-for-modeling-fuzzy |
Repo | |
Framework | |
How Does Knowledge of the AUC Constrain the Set of Possible Ground-truth Labelings?
Title | How Does Knowledge of the AUC Constrain the Set of Possible Ground-truth Labelings? |
Authors | Jacob Whitehill |
Abstract | Recent work on privacy-preserving machine learning has considered how data-mining competitions such as Kaggle could potentially be “hacked”, either intentionally or inadvertently, by using information from an oracle that reports a classifier’s accuracy on the test set. For binary classification tasks in particular, one of the most common accuracy metrics is the Area Under the ROC Curve (AUC), and in this paper we explore the mathematical structure of how the AUC is computed from an n-vector of real-valued “guesses” with respect to the ground-truth labels. We show how knowledge of a classifier’s AUC on the test set can constrain the set of possible ground-truth labelings, and we derive an algorithm both to compute the exact number of such labelings and to enumerate efficiently over them. Finally, we provide empirical evidence that, surprisingly, the number of compatible labelings can actually decrease as n grows, until a test set-dependent threshold is reached. |
Tasks | Accuracy Metrics |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02418v2 |
http://arxiv.org/pdf/1709.02418v2.pdf | |
PWC | https://paperswithcode.com/paper/how-does-knowledge-of-the-auc-constrain-the |
Repo | |
Framework | |
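
To make the constraint described in the abstract above concrete, the sketch below computes the AUC as the fraction of correctly ordered (positive, negative) pairs, counting ties as one half, and then enumerates by brute force every binary labeling that reproduces a reported AUC. This is an exponential-time illustration for tiny n, not the counting or enumeration algorithm derived in the paper; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def auc(guesses, labels):
    """Exact AUC of real-valued guesses against 0/1 labels.

    Returns None when the labeling has no positives or no negatives,
    because the AUC is undefined in that case.
    """
    pos = [g for g, y in zip(guesses, labels) if y == 1]
    neg = [g for g, y in zip(guesses, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = Fraction(0)
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1                 # correctly ordered pair
            elif p == n:
                wins += Fraction(1, 2)    # tie counts as half
    return wins / (len(pos) * len(neg))


def compatible_labelings(guesses, target_auc):
    """All 0/1 labelings whose AUC equals target_auc (brute force, 2**n cases)."""
    return [labels
            for labels in product((0, 1), repeat=len(guesses))
            if auc(guesses, labels) == target_auc]


if __name__ == "__main__":
    guesses = [0.1, 0.4, 0.35, 0.8]
    print(auc(guesses, (0, 0, 1, 1)))                      # 3/4
    print(len(compatible_labelings(guesses, Fraction(1)))) # 3
```

Using `Fraction` keeps the comparison with the reported AUC exact, which matters because an oracle's AUC pins down a rational number with denominator equal to the number of positive-negative pairs.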
Improved Speech Reconstruction from Silent Video
Title | Improved Speech Reconstruction from Silent Video |
Authors | Ariel Ephrat, Tavi Halperin, Shmuel Peleg |
Abstract | Speechreading is the task of inferring phonetic information from visually observed articulatory facial movements, and is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible and natural-sounding acoustic speech signal from silent video frames of a speaking person. We train our model on speakers from the GRID and TCD-TIMIT datasets, and evaluate the quality and intelligibility of reconstructed speech using common objective measurements. We show that speech predictions from the proposed model attain scores which indicate significantly improved quality over existing models. In addition, we show promising results towards reconstructing speech from an unconstrained dictionary. |
Tasks | |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.01204v3 |
http://arxiv.org/pdf/1708.01204v3.pdf | |
PWC | https://paperswithcode.com/paper/improved-speech-reconstruction-from-silent |
Repo | |
Framework | |
Learning Hidden Quantum Markov Models
Title | Learning Hidden Quantum Markov Models |
Authors | Siddarth Srinivasan, Geoff Gordon, Byron Boots |
Abstract | Hidden Quantum Markov Models (HQMMs) can be thought of as quantum probabilistic graphical models that can model sequential data. We extend previous work on HQMMs with three contributions: (1) we show how classical hidden Markov models (HMMs) can be simulated on a quantum circuit, (2) we reformulate HQMMs by relaxing the constraints for modeling HMMs on quantum circuits, and (3) we present a learning algorithm to estimate the parameters of an HQMM from data. While our algorithm requires further optimization to handle larger datasets, we are able to evaluate our algorithm using several synthetic datasets. We show that on HQMM generated data, our algorithm learns HQMMs with the same number of hidden states and predictive accuracy as the true HQMMs, while HMMs learned with the Baum-Welch algorithm require more states to match the predictive accuracy. |
Tasks | |
Published | 2017-10-24 |
URL | http://arxiv.org/abs/1710.09016v1 |
http://arxiv.org/pdf/1710.09016v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-hidden-quantum-markov-models |
Repo | |
Framework | |
A Dynamic Edge Exchangeable Model for Sparse Temporal Networks
Title | A Dynamic Edge Exchangeable Model for Sparse Temporal Networks |
Authors | Yin Cheng Ng, Ricardo Silva |
Abstract | We propose a dynamic edge exchangeable network model that can capture sparse connections observed in real temporal networks, in contrast to existing models which are dense. The model achieved superior link prediction accuracy on multiple data sets when compared to a dynamic variant of the blockmodel, and is able to extract interpretable time-varying community structures from the data. In addition to sparsity, the model accounts for the effect of social influence on vertices’ future behaviours. Compared to the dynamic blockmodels, our model has a smaller latent space. The compact latent space requires a smaller number of parameters to be estimated in variational inference and results in a computationally friendly inference algorithm. |
Tasks | Link Prediction |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.04008v1 |
http://arxiv.org/pdf/1710.04008v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dynamic-edge-exchangeable-model-for-sparse |
Repo | |
Framework | |