Paper Group ANR 105
Capturing the diversity of biological tuning curves using generative adversarial networks. A Formal Characterization of the Local Search Topology of the Gap Heuristic. Automatic Selection of t-SNE Perplexity. Unsupervised Learning for Color Constancy. Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering. The Influence of Feature Representation of Text on the Performance of Document Classification. A dataset for Computer-Aided Detection of Pulmonary Embolism in CTA images. Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition. And That's A Fact: Distinguishing Factual and Emotional Argumentation in Online Dialogue. JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis. Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration. Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks. Learning what to read: Focused machine reading. Phylogenetic Tools in Astrophysics. Feature selection algorithm based on Catastrophe model to improve the performance of regression analysis.
Capturing the diversity of biological tuning curves using generative adversarial networks
Title | Capturing the diversity of biological tuning curves using generative adversarial networks |
Authors | Takafumi Arakaki, G. Barello, Yashar Ahmadian |
Abstract | Tuning curves characterizing the response selectivities of biological neurons often exhibit large degrees of irregularity and diversity across neurons. Theoretical network models that feature heterogeneous cell populations or random connectivity also give rise to diverse tuning curves. However, a general framework for fitting such models to experimentally measured tuning curves is lacking. We address this problem by proposing to view mechanistic network models as generative models whose parameters can be optimized to fit the distribution of experimentally measured tuning curves. A major obstacle for fitting such models is that their likelihood function is not explicitly available or is highly intractable to compute. Recent advances in machine learning provide ways for fitting generative models without the need to evaluate the likelihood and its gradient. Generative Adversarial Networks (GAN) provide one such framework which has been successful in traditional machine learning tasks. We apply this approach in two separate experiments, showing how GANs can be used to fit commonly used mechanistic models in theoretical neuroscience to datasets of measured tuning curves. This fitting procedure avoids the computationally expensive step of inferring latent variables, e.g. the biophysical parameters of individual cells or the particular realization of the full synaptic connectivity matrix, and directly learns model parameters which characterize the statistics of connectivity or of single-cell properties. Another strength of this approach is that it fits the entire, joint distribution of experimental tuning curves, instead of matching a few summary statistics picked a priori by the user. More generally, this framework opens the door to fitting theoretically motivated dynamical network models directly to simultaneously or non-simultaneously recorded neural responses. |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.04582v3 |
http://arxiv.org/pdf/1707.04582v3.pdf | |
PWC | https://paperswithcode.com/paper/capturing-the-diversity-of-biological-tuning |
Repo | |
Framework | |
A Formal Characterization of the Local Search Topology of the Gap Heuristic
Title | A Formal Characterization of the Local Search Topology of the Gap Heuristic |
Authors | Richard Anthony Valenzano, Danniel Sihui Yang |
Abstract | The pancake puzzle is a classic optimization problem that has become a standard benchmark for heuristic search algorithms. In this paper, we provide full proofs regarding the local search topology of the gap heuristic for the pancake puzzle. First, we show that in any non-goal state in which there is no move that will decrease the number of gaps, there is a move that will keep the number of gaps constant. We then classify any state in which the number of gaps cannot be decreased in a single action into two groups: those requiring 2 actions to decrease the number of gaps, and those which require 3 actions to decrease the number of gaps. |
Tasks | |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04665v1 |
http://arxiv.org/pdf/1705.04665v1.pdf | |
PWC | https://paperswithcode.com/paper/a-formal-characterization-of-the-local-search |
Repo | |
Framework | |
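As a concrete illustration of the heuristic characterized above, here is a minimal sketch of the gap count for the pancake puzzle, assuming the standard convention that the plate acts as a pancake of size n+1 at the bottom of the stack (the `flip` move is the puzzle's only operator):

```python
def gap_heuristic(stack):
    """Count 'gaps': adjacent positions whose pancake sizes differ by
    more than 1. The plate is treated as a pancake of size n+1 at the
    bottom, as is standard for this heuristic."""
    extended = list(stack) + [len(stack) + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if abs(a - b) > 1)

def flip(stack, k):
    """Reverse the top k pancakes (indices 0..k-1): the single move
    type of the pancake puzzle."""
    return stack[:k][::-1] + stack[k:]

print(gap_heuristic([1, 2, 3, 4, 5]))  # goal state: 0 gaps
print(gap_heuristic([3, 1, 2, 5, 4]))  # 3 gaps: (3,1), (2,5), (4,plate)
```

The paper's results concern states where no single `flip` reduces this count; the sketch only shows the count itself.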
Automatic Selection of t-SNE Perplexity
Title | Automatic Selection of t-SNE Perplexity |
Authors | Yanshuai Cao, Luyu Wang |
Abstract | t-Distributed Stochastic Neighbor Embedding (t-SNE) is one of the most widely used dimensionality reduction methods for data visualization, but it has a perplexity hyperparameter that requires manual selection. In practice, proper tuning of t-SNE perplexity requires users to understand the inner working of the method as well as to have hands-on experience. We propose a model selection objective for t-SNE perplexity that requires negligible extra computation beyond that of the t-SNE itself. We empirically validate that the perplexity settings found by our approach are consistent with preferences elicited from human experts across a number of datasets. The similarities of our approach to Bayesian information criteria (BIC) and minimum description length (MDL) are also analyzed. |
Tasks | Dimensionality Reduction, Model Selection |
Published | 2017-08-10 |
URL | http://arxiv.org/abs/1708.03229v1 |
http://arxiv.org/pdf/1708.03229v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-selection-of-t-sne-perplexity |
Repo | |
Framework | |
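The model selection objective described above can be sketched as a BIC-style trade-off between the final KL loss of a t-SNE run and a penalty that grows with perplexity. The exact functional form below is an assumption for illustration, not necessarily the paper's formula, and the KL values in `kls` are hypothetical:

```python
import math

def pseudo_bic(kl_divergence, perplexity, n):
    """BIC-style score: data fit (the KL loss t-SNE minimizes) plus a
    complexity penalty growing with perplexity. Illustrative form only."""
    return 2.0 * kl_divergence + math.log(n) * perplexity / n

def select_perplexity(kl_by_perplexity, n):
    """Pick the candidate perplexity minimizing the score.
    `kl_by_perplexity` maps each candidate to the final KL of a run."""
    return min(kl_by_perplexity,
               key=lambda p: pseudo_bic(kl_by_perplexity[p], p, n))

# Hypothetical KL losses from four t-SNE runs on n = 1000 points:
kls = {5: 1.9, 30: 1.2, 50: 1.15, 300: 1.1}
best = select_perplexity(kls, 1000)
```

Very large perplexities always drive KL down, so a raw KL comparison would over-select them; the penalty term counteracts that, which is the sense in which the objective resembles BIC/MDL.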
Unsupervised Learning for Color Constancy
Title | Unsupervised Learning for Color Constancy |
Authors | Nikola Banić, Karlo Koščević, Sven Lončarić |
Abstract | Most digital camera pipelines use color constancy methods to reduce the influence of illumination and the camera sensor on the colors of scene objects. The highest accuracy of color correction is obtained with learning-based color constancy methods, but they require a significant amount of calibrated training images with known ground-truth illumination. Such calibration is time consuming, preferably done for each sensor individually, and therefore a major bottleneck in acquiring high color constancy accuracy. Statistics-based methods do not require calibrated training images, but they are less accurate. In this paper an unsupervised learning-based method is proposed that learns its parameter values after approximating the unknown ground-truth illumination of the training images, thus avoiding calibration. In terms of accuracy the proposed method outperforms all statistics-based and many learning-based methods. An extension of the method is also proposed, which learns the needed parameters from non-calibrated images taken with one sensor and can then be successfully applied to images taken with another sensor. This effectively enables inter-camera unsupervised learning for color constancy. Additionally, a new high-quality color constancy benchmark dataset with 1707 calibrated images is created, used for testing, and made publicly available. The results are presented and discussed. The source code and the dataset are available at http://www.fer.unizg.hr/ipg/resources/color_constancy/. |
Tasks | Calibration, Color Constancy |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00436v4 |
http://arxiv.org/pdf/1712.00436v4.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-for-color-constancy |
Repo | |
Framework | |
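For context on the statistics-based methods the abstract contrasts with, here is a minimal Gray-world sketch (a classic baseline, not the paper's method); the pixel values are hypothetical:

```python
def gray_world(image):
    """Estimate the scene illuminant as the per-channel mean of the
    image (Gray-world assumption: average reflectance is achromatic).
    `image` is a list of (r, g, b) pixels with float components."""
    n = len(image)
    est = [sum(px[c] for px in image) / n for c in range(3)]
    norm = sum(e * e for e in est) ** 0.5  # normalize to unit length
    return [e / norm for e in est]

def correct(image, illuminant):
    """Von Kries-style correction: divide each channel by its
    estimated illuminant component (correct up to overall scale)."""
    return [tuple(px[c] / illuminant[c] for c in range(3)) for px in image]

# A reddish cast: the estimated illuminant leans toward the R channel.
pixels = [(0.9, 0.4, 0.3), (0.8, 0.5, 0.4), (0.7, 0.3, 0.2)]
ill = gray_world(pixels)
balanced = correct(pixels, ill)
```

After correction the per-channel means are equal, i.e. the average of the image is gray, which is exactly what the Gray-world assumption enforces. The paper's idea, roughly, is to use such cheap illumination approximations in place of calibrated ground truth when training a learning-based method.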
Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering
Title | Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering |
Authors | Xin Bing, Florentina Bunea, Yang Ning, Marten Wegkamp |
Abstract | This work introduces a novel estimation method, called LOVE, of the entries and structure of a loading matrix A in a sparse latent factor model X = AZ + E, for an observable random vector X in R^p, with correlated unobservable factors Z in R^K, with K unknown, and independent noise E. Each row of A is scaled and sparse. In order to identify the loading matrix A, we require the existence of pure variables, which are components of X that are associated, via A, with one and only one latent factor. Despite the fact that the number of factors K, the number of the pure variables, and their location are all unknown, we only require a mild condition on the covariance matrix of Z, and a minimum of only two pure variables per latent factor to show that A is uniquely defined, up to signed permutations. Our proofs for model identifiability are constructive, and lead to our novel estimation method of the number of factors and of the set of pure variables, from a sample of size n of observations on X. This is the first step of our LOVE algorithm, which is optimization-free, and has low computational complexity of order p^2. The second step of LOVE is an easily implementable linear program that estimates A. We prove that the resulting estimator is minimax rate optimal up to logarithmic factors in p. The model structure is motivated by the problem of overlapping variable clustering, ubiquitous in data science. We define the population level clusters as groups of those components of X that are associated, via the sparse matrix A, with the same unobservable latent factor, and multi-factor association is allowed. Clusters are respectively anchored by the pure variables, and form overlapping sub-groups of the p-dimensional random vector X. The Latent model approach to OVErlapping clustering is reflected in the name of our algorithm, LOVE. |
Tasks | |
Published | 2017-04-23 |
URL | https://arxiv.org/abs/1704.06977v4 |
https://arxiv.org/pdf/1704.06977v4.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-estimation-in-structured-factor |
Repo | |
Framework | |
The Influence of Feature Representation of Text on the Performance of Document Classification
Title | The Influence of Feature Representation of Text on the Performance of Document Classification |
Authors | Sanda Martinčić-Ipšić, Tanja Miličić, Ljupčo Todorovski |
Abstract | In this paper we perform a comparative analysis of three models for feature representation of text documents in the context of document classification. In particular, we consider the most often used family of models, bag-of-words, the recently proposed continuous space models word2vec and doc2vec, and the model based on the representation of text documents as language networks. While the bag-of-words models have been extensively used for the document classification task, the performance of the other two models for the same task has not been well understood. This is especially true for the network-based model, which has rarely been considered for the representation of text documents for classification. In this study, we measure the performance of document classifiers trained using the method of random forests on features generated by the three models and their variants. The results of the empirical comparison show that the commonly used bag-of-words model has performance comparable to the one obtained by the emerging continuous-space model doc2vec. In particular, the low-dimensional variants of doc2vec generating up to 75 features are among the top-performing document representation models. The results finally point out that doc2vec shows superior performance in the task of classifying large documents. |
Tasks | Document Classification |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01321v1 |
http://arxiv.org/pdf/1707.01321v1.pdf | |
PWC | https://paperswithcode.com/paper/the-influence-of-feature-representation-of |
Repo | |
Framework | |
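A minimal sketch of the simplest bag-of-words variant compared above, representing each document as raw term counts over a shared vocabulary (the toy documents are hypothetical):

```python
from collections import Counter

def bag_of_words(docs):
    """Build a shared vocabulary and represent each document as a
    vector of raw term counts (the simplest bag-of-words variant)."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([counts.get(tok, 0) for tok in vocab])
    return vocab, vectors

docs = ["the cat sat", "the dog sat on the mat"]
vocab, vecs = bag_of_words(docs)
# vocab: ['cat', 'dog', 'mat', 'on', 'sat', 'the']
```

Such vectors have one dimension per vocabulary word; the paper's point of comparison is that doc2vec achieves similar classification accuracy with as few as 75 learned dimensions.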
A dataset for Computer-Aided Detection of Pulmonary Embolism in CTA images
Title | A dataset for Computer-Aided Detection of Pulmonary Embolism in CTA images |
Authors | Mojtaba Masoudi, Hamidreza Pourreza, Mahdi Saadatmand Tarzjan, Fateme Shafiee Zargar, Masoud Pezeshki Rad, Noushin Eftekhari |
Abstract | Today, researchers in the field of Pulmonary Embolism (PE) analysis need a publicly available dataset to assess and compare their methods. Different systems have been designed for the detection of pulmonary embolism (PE), but none of them have used any public dataset; all papers have used their own private datasets. In order to fill this gap, we have collected 5160 slices of computed tomography angiography (CTA) images acquired from 20 patients, and after the images were labeled by experts in this field, we provide a reliable dataset which is now publicly available. In some situations, PE detection can be difficult, for example when it occurs in the peripheral branches or when patients have pulmonary diseases (such as parenchymal disease). Therefore, the efficiency of CAD systems highly depends on the dataset. In the given dataset, 66% of PEs are located in peripheral branches, and different pulmonary diseases are also included. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01330v1 |
http://arxiv.org/pdf/1707.01330v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dataset-for-computer-aided-detection-of |
Repo | |
Framework | |
Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition
Title | Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition |
Authors | Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan |
Abstract | Heterogeneous face recognition (HFR) aims to match facial images acquired from different sensing modalities, with mission-critical applications in forensics, security and commercial sectors. However, HFR is a much more challenging problem than traditional face recognition because of large intra-class variations of heterogeneous face images and limited training samples of cross-modality face image pairs. This paper proposes a novel approach, namely Wasserstein CNN (convolutional neural networks, or WCNN for short), to learn invariant features between near-infrared and visual face images (i.e. NIR-VIS face recognition). The low-level layers of WCNN are trained with widely available face images in the visual spectrum. The high-level layer is divided into three parts, i.e., the NIR layer, the VIS layer and the NIR-VIS shared layer. The first two layers aim to learn modality-specific features, and the NIR-VIS shared layer is designed to learn a modality-invariant feature subspace. Wasserstein distance is introduced into the NIR-VIS shared layer to measure the dissimilarity between heterogeneous feature distributions. WCNN learning thus aims to minimize the Wasserstein distance between the NIR distribution and the VIS distribution, yielding an invariant deep feature representation of heterogeneous face images. To avoid the over-fitting problem on small-scale heterogeneous face data, a correlation prior is introduced on the fully-connected layers of the WCNN network to reduce the parameter space. This prior is implemented by a low-rank constraint in an end-to-end network. The joint formulation leads to an alternating minimization for deep feature representation at the training stage and an efficient computation for heterogeneous data at the testing stage. Extensive experiments on three challenging NIR-VIS face recognition databases demonstrate the significant superiority of Wasserstein CNN over state-of-the-art methods. |
Tasks | Face Recognition, Heterogeneous Face Recognition |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02412v1 |
http://arxiv.org/pdf/1708.02412v1.pdf | |
PWC | https://paperswithcode.com/paper/wasserstein-cnn-learning-invariant-features |
Repo | |
Framework | |
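The shared layer above minimizes a Wasserstein distance between NIR and VIS feature distributions. As a hedged illustration of the distance itself (not the WCNN training objective), for two equal-size 1-D empirical samples the W1 distance reduces to the mean absolute difference of matched order statistics:

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size 1-D empirical distributions:
    sort both samples and average the absolute differences of the
    matched order statistics (the optimal coupling in 1-D)."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

# Shifting a distribution by a constant c gives W1 exactly |c|,
# while identical samples are at distance 0.
a = [0.1, 0.5, 0.9, 1.3]
b = [x + 2.0 for x in a]
d = wasserstein_1d(a, b)
```

Unlike KL divergence, this distance stays finite and informative even when the two feature distributions have little overlap, which is one common motivation for using it to align modalities.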
And That’s A Fact: Distinguishing Factual and Emotional Argumentation in Online Dialogue
Title | And That’s A Fact: Distinguishing Factual and Emotional Argumentation in Online Dialogue |
Authors | Shereen Oraby, Lena Reed, Ryan Compton, Ellen Riloff, Marilyn Walker, Steve Whittaker |
Abstract | We investigate the characteristics of factual and emotional argumentation styles observed in online debates. Using an annotated set of “factual” and “feeling” debate forum posts, we extract patterns that are highly correlated with factual and emotional arguments, and then apply a bootstrapping methodology to find new patterns in a larger pool of unannotated forum posts. This process automatically produces a large set of patterns representing linguistic expressions that are highly correlated with factual and emotional language. Finally, we analyze the most discriminating patterns to better understand the defining characteristics of factual and emotional arguments. |
Tasks | |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05295v1 |
http://arxiv.org/pdf/1709.05295v1.pdf | |
PWC | https://paperswithcode.com/paper/and-thats-a-fact-distinguishing-factual-and-1 |
Repo | |
Framework | |
JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis
Title | JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis |
Authors | Ryosuke Sonobe, Shinnosuke Takamichi, Hiroshi Saruwatari |
Abstract | Thanks to improvements in machine learning techniques including deep learning, a free large-scale speech corpus that can be shared between academic institutions and commercial companies has an important role. However, such a corpus for Japanese speech synthesis does not exist. In this paper, we designed a novel Japanese speech corpus, named the “JSUT corpus,” that is aimed at achieving end-to-end speech synthesis. The corpus consists of 10 hours of reading-style speech data and its transcription and covers all of the main pronunciations of daily-use Japanese characters. In this paper, we describe how we designed and analyzed the corpus. The corpus is freely available online. |
Tasks | Speech Synthesis |
Published | 2017-10-28 |
URL | http://arxiv.org/abs/1711.00354v1 |
http://arxiv.org/pdf/1711.00354v1.pdf | |
PWC | https://paperswithcode.com/paper/jsut-corpus-free-large-scale-japanese-speech |
Repo | |
Framework | |
Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration
Title | Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration |
Authors | Marco Grzegorczyk, Andrej Aderhold, Dirk Husmeier |
Abstract | Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. In the present article, we propose a thermodynamic integration scheme that directly targets the log Bayes factor. The method is based on a modified annealing path between the posterior distributions of the two models compared, which systematically avoids the high variance prior regime. We combine this scheme with the concept of non-equilibrium TI to minimise discretisation errors from numerical integration. Results obtained on Bayesian regression models applied to standard benchmark data, and a complex hierarchical model applied to biopathway inference, demonstrate a significant reduction in estimator variance over state-of-the-art TI methods. |
Tasks | |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07305v1 |
http://arxiv.org/pdf/1703.07305v1.pdf | |
PWC | https://paperswithcode.com/paper/targeting-bayes-factors-with-direct-path-non |
Repo | |
Framework | |
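The TI identity behind the abstract can be checked on a toy discrete model: log Z equals the integral over beta in [0, 1] of E under the power posterior p_beta (proportional to prior times likelihood^beta) of the log likelihood. This sketches the standard prior-to-posterior path only; the paper's contribution, a modified path directly between two posteriors, is not reproduced here. The toy prior and likelihood values are hypothetical:

```python
import math

# Toy discrete model: 3 parameter values with a prior and likelihoods.
prior = [0.5, 0.3, 0.2]
lik = [0.1, 0.4, 0.9]

def expected_loglik(beta):
    """E[log L] under the power posterior p_beta ∝ prior * lik**beta."""
    weights = [p * (l ** beta) for p, l in zip(prior, lik)]
    z = sum(weights)
    return sum(w / z * math.log(l) for w, l in zip(weights, lik))

# TI identity: log Z = integral over [0,1] of expected_loglik(beta),
# approximated here by the trapezoid rule on a fine grid.
grid = [i / 1000 for i in range(1001)]
vals = [expected_loglik(b) for b in grid]
ti_log_z = sum((vals[i] + vals[i + 1]) / 2 for i in range(1000)) / 1000

# Direct computation for comparison: Z = sum of prior * likelihood.
direct_log_z = math.log(sum(p * l for p, l in zip(prior, lik)))
```

The variance problem the paper addresses shows up in the sampling version of this scheme: near beta = 0 the power posterior is essentially the prior, where log-likelihood draws fluctuate most, which is exactly the regime the direct-path scheme avoids.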
Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks
Title | Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks |
Authors | Fenglong Ma, Radha Chitta, Jing Zhou, Quanzeng You, Tong Sun, Jing Gao |
Abstract | Predicting the future health information of patients from historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work solves this problem by employing recurrent neural networks (RNNs) to model EHR data and utilizing a simple attention mechanism to interpret the results. However, RNN-based approaches suffer from the problem that the performance of RNNs drops when the length of sequences is large, and the relationships between subsequent visits are ignored by current RNN-based approaches. To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to remember all the information of both the past visits and the future visits, and it introduces three attention mechanisms to measure the relationships of different visits for the prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which are positively confirmed by medical experts. Experimental results on two real-world EHR datasets show that the proposed Dipole can significantly improve the prediction accuracy compared with state-of-the-art diagnosis prediction approaches and provide clinically meaningful interpretation. |
Tasks | |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.05764v1 |
http://arxiv.org/pdf/1706.05764v1.pdf | |
PWC | https://paperswithcode.com/paper/dipole-diagnosis-prediction-in-healthcare-via |
Repo | |
Framework | |
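One ingredient above, attention over visit-level hidden states, can be sketched as a softmax-weighted sum. In Dipole the scores come from learned functions of the hidden states (three variants are proposed); in this toy sketch the scores and hidden states are supplied directly:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(hidden_states, scores):
    """Weight each visit's hidden state by its attention weight and
    sum them into a single context vector for prediction."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Three visits with 2-D hidden states; the last visit scores highest,
# so it dominates the context vector.
hs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, ctx = attention_context(hs, [0.1, 0.2, 2.0])
```

The weights are also what makes the model interpretable: inspecting them reveals which past visits drove a given prediction.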
Learning what to read: Focused machine reading
Title | Learning what to read: Focused machine reading |
Authors | Enrique Noriega-Atala, Marco A. Valenzuela-Escarcega, Clayton T. Morrison, Mihai Surdeanu |
Abstract | Recent efforts in bioinformatics have achieved tremendous progress in the machine reading of biomedical literature, and the assembly of the extracted biochemical interactions into large-scale models such as protein signaling pathways. However, batch machine reading of literature at today's scale (PubMed alone indexes over 1 million papers per year) is infeasible due to both cost and processing overhead. In this work, we introduce a focused reading approach that guides machine reading of the biomedical literature toward what should be read to answer a biomedical query as efficiently as possible. We introduce a family of algorithms for focused reading, including an intuitive, strong baseline, and a second approach which uses a reinforcement learning (RL) framework that learns when to explore (widen the search) or exploit (narrow it). We demonstrate that the RL approach is capable of answering more queries than the baseline, while being more efficient, i.e., reading fewer documents. |
Tasks | Reading Comprehension |
Published | 2017-09-01 |
URL | http://arxiv.org/abs/1709.00149v1 |
http://arxiv.org/pdf/1709.00149v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-what-to-read-focused-machine-reading |
Repo | |
Framework | |
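The explore/exploit trade-off the RL policy learns can be illustrated with a generic epsilon-greedy rule over value estimates; this is a stand-in for intuition, not the paper's focused-reading policy, and the two actions and their values are hypothetical:

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon, explore: pick a random action (e.g.
    widen the literature search). Otherwise exploit: pick the action
    with the highest current value estimate (narrow the search)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def update(q_values, action, reward, alpha=0.1):
    """Incremental value update after observing a reward, e.g. whether
    the documents read helped answer the query."""
    q_values[action] += alpha * (reward - q_values[action])

rng = random.Random(0)
q = [0.2, 0.7]  # hypothetical values: 0 = widen search, 1 = narrow it
chosen = epsilon_greedy(q, 0.0, rng)  # epsilon = 0: pure exploitation
update(q, chosen, 1.0)                # reward pulls the estimate upward
```

In the paper's setting the state is richer (the query and what has been read so far) and the reward reflects reading cost versus query progress, but the same explore-or-exploit decision is at the core.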
Phylogenetic Tools in Astrophysics
Title | Phylogenetic Tools in Astrophysics |
Authors | Didier Fraix-Burnet |
Abstract | Multivariate clustering in astrophysics is a recent development, justified by ever-larger surveys of the sky. The phylogenetic approach is probably the most unexpected technique that has appeared for the unsupervised classification of galaxies, stellar populations or globular clusters. On one side, this is a somewhat natural way of classifying astrophysical entities, which are all evolving objects. On the other side, several conceptual and practical difficulties arise, such as the hierarchical representation of astrophysical diversity, the continuous nature of the parameters, and the adequacy of the result for the usual practice of physical interpretation. Most of these have now been solved through studies of limited samples of stellar clusters and galaxies. Up to now, only Maximum Parsimony (cladistics) has been used, since it is the simplest and most general phylogenetic technique. Probabilistic and network approaches are obvious extensions that should be explored in the future. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00286v1 |
http://arxiv.org/pdf/1703.00286v1.pdf | |
PWC | https://paperswithcode.com/paper/phylogenetic-tools-in-astrophysics |
Repo | |
Framework | |
Feature selection algorithm based on Catastrophe model to improve the performance of regression analysis
Title | Feature selection algorithm based on Catastrophe model to improve the performance of regression analysis |
Authors | Mahdi Zarei |
Abstract | In this paper we introduce a new feature selection algorithm to remove irrelevant or redundant features from data sets. In this algorithm, the importance of a feature is based on its fit to the Catastrophe model. The Akaike information criterion (AIC) value is used for ranking the features in the data set. The proposed algorithm is compared with the well-known RELIEF feature selection algorithm. The Breast Cancer, Parkinson Telemonitoring and Slice locality data sets are used to evaluate the model. |
Tasks | Feature Selection |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06656v1 |
http://arxiv.org/pdf/1704.06656v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-selection-algorithm-based-on |
Repo | |
Framework | |
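The AIC-based ranking idea can be sketched with per-feature linear fits standing in for the paper's Catastrophe-model fits; the feature names and data below are hypothetical:

```python
import math

def simple_fit_rss(x, y):
    """Least-squares line y = a + b*x; return the residual sum of
    squares of the fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def aic_rank(features, y, k=2):
    """Rank features by AIC = n*log(RSS/n) + 2k of a per-feature fit;
    lower AIC means the feature better explains the response. The tiny
    constant guards against log(0) for a perfect fit."""
    n = len(y)
    scores = {name: n * math.log(simple_fit_rss(x, y) / n + 1e-12) + 2 * k
              for name, x in features.items()}
    return sorted(scores, key=scores.get)

y = [1.0, 2.1, 2.9, 4.2, 5.0]
features = {"relevant": [1, 2, 3, 4, 5], "noise": [3, 1, 4, 1, 5]}
ranking = aic_rank(features, y)
```

A nearly linear feature yields a small RSS and hence a strongly negative AIC, placing it at the top of the ranking, while an uninformative feature is pushed to the bottom; irrelevant or redundant features can then be dropped from the tail.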