Paper Group ANR 1429
Developmentally motivated emergence of compositional communication via template transfer
Title | Developmentally motivated emergence of compositional communication via template transfer |
Authors | Tomasz Korbak, Julian Zubek, Łukasz Kuciński, Piotr Miłoś, Joanna Rączaszek-Leonardi |
Abstract | This paper explores a novel approach to achieving emergent compositional communication in multi-agent systems. We propose a training regime implementing template transfer, the idea of carrying over learned biases across contexts. In our method, a sender-receiver pair is first trained with disentangled loss functions and then the receiver is transferred to train a new sender with a standard loss. Unlike other methods (e.g. the obverter algorithm), our approach does not require imposing inductive biases on the architecture of the agents. We experimentally show the emergence of compositional communication using topographical similarity, zero-shot generalization and context independence as evaluation metrics. The presented approach is connected to an important line of work in semiotics and developmental psycholinguistics: it supports a conjecture that compositional communication is scaffolded on simpler communication protocols. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.06079v1 |
PDF | https://arxiv.org/pdf/1910.06079v1.pdf |
PWC | https://paperswithcode.com/paper/developmentally-motivated-emergence-of |
Repo | |
Framework | |
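The two-stage regime described in the abstract lends itself to a compact sketch. Below is a minimal, hedged PyTorch illustration of template transfer: a sender-receiver pair is first trained with a per-attribute (disentangled) loss, then the receiver is frozen and a fresh sender is trained against it with a standard joint loss. The toy sizes, the relaxed (softmax) messages, and the exact form of both losses are illustrative assumptions, not the authors' setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as Fn

N_ATTR, N_VAL, VOCAB, MSG_LEN = 2, 5, 10, 2   # toy sizes (assumptions)

class Sender(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(N_ATTR * N_VAL, MSG_LEN * VOCAB)
    def forward(self, x):                       # x: one-hot attribute vectors
        logits = self.net(x).view(-1, MSG_LEN, VOCAB)
        return torch.softmax(logits, dim=-1)    # relaxed, differentiable message

class Receiver(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(MSG_LEN * VOCAB, N_ATTR * N_VAL)
    def forward(self, m):                       # per-attribute logits
        return self.net(m.flatten(1)).view(-1, N_ATTR, N_VAL)

def batch(n=64):
    attrs = torch.randint(0, N_VAL, (n, N_ATTR))
    return Fn.one_hot(attrs, N_VAL).float().flatten(1), attrs

ce = nn.CrossEntropyLoss()
receiver = Receiver()

# Stage 1: disentangled loss -- one cross-entropy per attribute.
sender1 = Sender()
opt = torch.optim.Adam(list(sender1.parameters()) + list(receiver.parameters()), lr=1e-2)
for _ in range(300):
    x, attrs = batch()
    out = receiver(sender1(x))
    loss = sum(ce(out[:, i], attrs[:, i]) for i in range(N_ATTR))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: template transfer -- freeze the trained receiver and train a
# *new* sender against it with a standard joint (whole-object) loss.
for p in receiver.parameters():
    p.requires_grad_(False)
sender2 = Sender()
opt2 = torch.optim.Adam(sender2.parameters(), lr=1e-2)
for _ in range(300):
    x, attrs = batch()
    out = receiver(sender2(x))                  # (B, 2, N_VAL)
    joint = out[:, 0].unsqueeze(2) + out[:, 1].unsqueeze(1)
    loss = ce(joint.flatten(1), attrs[:, 0] * N_VAL + attrs[:, 1])
    opt2.zero_grad(); loss.backward(); opt2.step()
```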
Research on Clustering Performance of Sparse Subspace Clustering
Title | Research on Clustering Performance of Sparse Subspace Clustering |
Authors | Wen-Jin Fu, Xiao-Jun Wu, He-Feng Yin, Wen-Bo Hu |
Abstract | Recently, sparse subspace clustering has become a valuable tool for dealing with high-dimensional data. There are two essential steps in the sparse subspace clustering framework. The first is solving for the coefficient matrix of the data; the second is constructing an affinity matrix from the coefficient matrix, which is then fed to spectral clustering. This paper investigates the factors that affect clustering performance, in terms of both the clustering accuracy and the stability of approaches based on existing algorithms. We select four methods to solve the coefficient matrix and use four different ways to construct a similarity matrix for each coefficient matrix. We then compare the clustering performance of the different combinations on three datasets. The experimental results indicate that both the coefficient matrix and the affinity matrix have a large influence on clustering performance, and that developing a stable and valid algorithm remains an open problem. |
Tasks | |
Published | 2019-12-21 |
URL | https://arxiv.org/abs/1912.10256v1 |
PDF | https://arxiv.org/pdf/1912.10256v1.pdf |
PWC | https://paperswithcode.com/paper/research-on-clustering-performance-of-sparse |
Repo | |
Framework | |
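The two-step pipeline described in the abstract can be grounded in a minimal sketch: (1) solve a sparse self-representation C so that X ≈ XC with zero diagonal, (2) build the affinity |C| + |C|ᵀ and feed it to spectral clustering. Lasso is just one of the four coefficient-matrix solvers the paper compares, and the parameters below are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def ssc(X, n_clusters, alpha=0.01):
    n = X.shape[0]                      # X: (n_samples, n_features)
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        # express x_i sparsely in terms of the remaining points
        reg = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        reg.fit(X[others].T, X[i])
        C[i, others] = reg.coef_
    W = np.abs(C) + np.abs(C).T         # symmetric affinity matrix
    return SpectralClustering(n_clusters=n_clusters,
                              affinity='precomputed').fit_predict(W)

# toy example: two 1-D subspaces (lines) in R^3
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(30, 1)) * np.array([[1.0, 0.0, 0.0]]),
               rng.normal(size=(30, 1)) * np.array([[0.0, 1.0, 0.0]])])
print(ssc(X, n_clusters=2))
```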
LEMO: Learn to Equalize for MIMO-OFDM Systems with Low-Resolution ADCs
Title | LEMO: Learn to Equalize for MIMO-OFDM Systems with Low-Resolution ADCs |
Authors | Lei Chu, Husheng Li, Robert Caiming Qiu |
Abstract | This paper develops a new deep neural network optimized equalization framework for massive multiple input multiple output orthogonal frequency division multiplexing (MIMO-OFDM) systems that employ low-resolution analog-to-digital converters (ADCs) at the base station (BS). The use of low-resolution ADCs can greatly reduce hardware complexity and circuit power consumption; however, it leaves the BS nearly blind to the channel state information, making the equalization problem difficult. In this paper, we consider a supervised learning architecture, where the goal is to learn a representative function that can predict the targets (constellation points) from the inputs (outputs of the low-resolution ADCs) based on labeled training data (pilot signals). Specifically, our main contributions are two-fold: 1) We design a new activation function, whose outputs are close to the constellation points when the parameters are finally optimized, to help us fully exploit the stochastic gradient descent method for the discrete optimization problem. 2) An unsupervised loss is designed and added to the optimization objective, aiming to enhance the representation ability (so-called generalization). The experimental results reveal that the proposed equalizer is robust to different channel taps (i.e., Gaussian and Poisson), significantly outperforms the linearized MMSE equalizer, and shows potential for pilot saving. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.06329v1 |
PDF | https://arxiv.org/pdf/1905.06329v1.pdf |
PWC | https://paperswithcode.com/paper/lemo-learn-to-equalize-for-mimo-ofdm-systems |
Repo | |
Framework | |
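As a rough illustration of contribution 1) in the abstract, the following hedged PyTorch sketch shows one generic way to build an activation whose outputs concentrate near constellation points while staying differentiable: a softmax-weighted "soft projection" onto the constellation. The points (±1 per real dimension) and temperature are assumptions; the paper's actual activation may take a different form.

```python
import torch

def soft_constellation(x, points=(-1.0, 1.0), temperature=0.1):
    # softmax over negative squared distances to each constellation point;
    # a low temperature snaps outputs toward the nearest point while the
    # function remains differentiable for stochastic gradient descent
    p = torch.tensor(points)
    d2 = (x.unsqueeze(-1) - p) ** 2            # (..., n_points)
    w = torch.softmax(-d2 / temperature, dim=-1)
    return (w * p).sum(-1)

x = torch.linspace(-2, 2, 9, requires_grad=True)
y = soft_constellation(x)
print(y)            # values concentrated near -1 and +1
y.sum().backward()  # gradients flow, unlike a hard slicer
```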
BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling
Title | BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling |
Authors | Lars Maaløe, Marco Fraccaro, Valentin Liévin, Ole Winther |
Abstract | With the introduction of the variational autoencoder (VAE), probabilistic latent variable models have received renewed attention as powerful generative models. However, their performance in terms of test likelihood and quality of generated samples has been surpassed by autoregressive models without stochastic units. Furthermore, flow-based models have recently been shown to be an attractive alternative that scales well to high-dimensional data. In this paper we close the performance gap by constructing VAE models that can effectively utilize a deep hierarchy of stochastic variables and model complex covariance structures. We introduce the Bidirectional-Inference Variational Autoencoder (BIVA), characterized by a skip-connected generative model and an inference network formed by a bidirectional stochastic inference path. We show that BIVA reaches state-of-the-art test likelihoods, generates sharp and coherent natural images, and uses the hierarchy of latent variables to capture different aspects of the data distribution. We observe that BIVA, in contrast to recent results, can be used for anomaly detection. We attribute this to the hierarchy of latent variables which is able to extract high-level semantic features. Finally, we extend BIVA to semi-supervised classification tasks and show that it performs comparably to state-of-the-art results by generative adversarial networks. |
Tasks | Anomaly Detection, Latent Variable Models |
Published | 2019-02-06 |
URL | https://arxiv.org/abs/1902.02102v3 |
PDF | https://arxiv.org/pdf/1902.02102v3.pdf |
PWC | https://paperswithcode.com/paper/biva-a-very-deep-hierarchy-of-latent |
Repo | |
Framework | |
Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models
Title | Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models |
Authors | Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin |
Abstract | Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables. In this paper, we investigate several multi-level structures for learning a VAE model that generates long and coherent text. In particular, we use a hierarchy of stochastic layers between the encoder and decoder networks to generate more informative latent codes. We also investigate a multi-level decoder structure that learns coherent long-term structure by generating intermediate sentence representations as high-level plan vectors. Empirical results demonstrate that a multi-level VAE model produces more coherent and less repetitive long text than standard VAE models and can further mitigate the posterior-collapse issue. |
Tasks | Latent Variable Models, Text Generation |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00154v2 |
PDF | https://arxiv.org/pdf/1902.00154v2.pdf |
PWC | https://paperswithcode.com/paper/towards-generating-long-and-coherent-text |
Repo | |
Framework | |
Multimodal Sparse Classifier for Adolescent Brain Age Prediction
Title | Multimodal Sparse Classifier for Adolescent Brain Age Prediction |
Authors | Peyman Hosseinzadeh Kassani, Alexej Gossmann, Yu-Ping Wang |
Abstract | The study of healthy brain development helps us to better understand the brain transformations and brain connectivity patterns that occur from childhood to adulthood. This study presents a sparse machine learning solution across whole-brain functional connectivity (FC) measures of three sets of data, derived from resting state functional magnetic resonance imaging (rs-fMRI) and task fMRI data, including a working memory n-back task (nb-fMRI) and an emotion identification task (em-fMRI). These multi-modal image data are collected on a sample of adolescents from the Philadelphia Neurodevelopmental Cohort (PNC) for the prediction of brain age. Due to the extremely large variable-to-instance ratio of the PNC data, a high-dimensional matrix with many irrelevant and highly correlated features is generated, and hence a pattern learning approach is necessary to extract significant features. We propose a sparse learner based on the residual errors along the estimation of an inverse problem for the extreme learning machine (ELM) neural network. The purpose of the approach is to overcome the overlearning problem by pruning redundant features and their corresponding output weights. The proposed multimodal sparse ELM classifier based on residual errors (RES-ELM) is highly competitive in terms of classification accuracy compared to its counterparts, such as the conventional ELM and the sparse Bayesian learning ELM. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.01070v1 |
PDF | http://arxiv.org/pdf/1904.01070v1.pdf |
PWC | https://paperswithcode.com/paper/multimodal-sparse-classifier-for-adolescent |
Repo | |
Framework | |
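For readers unfamiliar with the ELM terminology in the abstract, here is a minimal numpy sketch of the base learner: random, fixed hidden weights and a single closed-form least-squares solve for the output weights. The residual-error-based pruning that defines RES-ELM is the paper's contribution and is not reproduced; sizes and data are toy assumptions.

```python
import numpy as np

def elm_fit(X, y, n_hidden=200, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights, one solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)
W, b, beta = elm_fit(X, y)
print(np.mean((elm_predict(X, W, b, beta) - y) ** 2))  # training MSE
```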
Relationship Explainable Multi-objective Optimization Via Vector Value Function Based Reinforcement Learning
Title | Relationship Explainable Multi-objective Optimization Via Vector Value Function Based Reinforcement Learning |
Authors | Huixin Zhan, Yongcan Cao |
Abstract | Solving multi-objective optimization problems is important in various applications where users are interested in obtaining optimal policies subject to multiple, yet often conflicting, objectives. A typical approach to obtaining an optimal policy is to first construct a loss function based on the scalarization of the individual objectives and then find the policy that minimizes this loss. However, optimizing the scalarized (and weighted) loss does not necessarily guarantee high performance on each of the possibly conflicting objectives. In this paper, we propose a vector-value-based reinforcement learning approach that seeks to explicitly learn the inter-objective relationship and optimize multiple objectives based on the learned relationship. In particular, the proposed method first defines a relationship matrix, a mathematical representation of the inter-objective relationship, and then creates one actor and multiple critics that co-learn the relationship matrix and action selection. The proposed approach can quantify the inter-objective relationship via reinforcement learning when the impact of one objective on another is unknown a priori. We also provide rigorous convergence analysis of the proposed approach and present a quantitative evaluation based on two testing scenarios. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01919v1 |
PDF | https://arxiv.org/pdf/1910.01919v1.pdf |
PWC | https://paperswithcode.com/paper/relationship-explainable-multi-objective-1 |
Repo | |
Framework | |
Preventing Posterior Collapse with delta-VAEs
Title | Preventing Posterior Collapse with delta-VAEs |
Authors | Ali Razavi, Aäron van den Oord, Ben Poole, Oriol Vinyals |
Abstract | Due to the phenomenon of “posterior collapse,” current latent variable generative models pose a challenging design choice that either weakens the capacity of the decoder or requires augmenting the objective so that it does not only maximize the likelihood of the data. In this paper, we propose an alternative that utilizes the most powerful generative models as decoders while optimizing the variational lower bound, all while ensuring that the latent variables preserve and encode useful information. Our proposed $\delta$-VAEs achieve this by constraining the variational family for the posterior to have a minimum distance to the prior. For sequential latent variable models, our approach resembles the classic representation learning approach of slow feature analysis. We demonstrate the efficacy of our approach at modeling text on LM1B and modeling images: learning representations, improving sample quality, and achieving state-of-the-art log-likelihood on CIFAR-10 and ImageNet $32\times 32$. |
Tasks | Latent Variable Models, Representation Learning |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.03416v1 |
PDF | http://arxiv.org/pdf/1901.03416v1.pdf |
PWC | https://paperswithcode.com/paper/preventing-posterior-collapse-with-delta-vaes |
Repo | |
Framework | |
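The core mechanism — constraining the variational family so its KL to the prior cannot fall below a floor $\delta$, which prevents the latent from being ignored — can be illustrated in a few lines. Fixing the posterior standard deviation to a constant $\sigma_q \neq 1$ is one simple family with this property; the paper's exact parameterization may differ.

```python
import math
import torch

def kl_gauss_std_normal(mu, log_sigma):
    # KL( N(mu, sigma^2) || N(0, 1) ), elementwise
    return 0.5 * (mu ** 2 + torch.exp(2 * log_sigma) - 1.0) - log_sigma

sigma_q = 0.5                                  # fixed posterior std (assumption)
log_sigma = torch.full((4,), math.log(sigma_q))
mu = torch.zeros(4)                            # even the KL-minimizing mu = 0 ...
print(kl_gauss_std_normal(mu, log_sigma))      # ... keeps KL at a floor > 0
```

With $\sigma_q = 0.5$ the floor is $0.5(\sigma_q^2 - 1) - \ln \sigma_q \approx 0.318$ nats per dimension, so the encoder is forced to commit at least that much rate to the latent code.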
A preference learning framework for multiple criteria sorting with diverse additive value models and valued assignment examples
Title | A preference learning framework for multiple criteria sorting with diverse additive value models and valued assignment examples |
Authors | Jiapeng Liu, Milosz Kadzinski, Xiuwu Liao, Xiaoxin Mao, Yao Wang |
Abstract | We present a preference learning framework for multiple criteria sorting. We consider sorting procedures applying an additive value model with diverse types of marginal value functions (including linear, piecewise-linear, splined, and general monotone ones) under a unified analytical framework. Unlike existing sorting methods that infer a preference model from crisp decision examples, where each reference alternative is assigned to a unique class, our framework allows us to consider valued assignment examples in which a reference alternative can be classified into multiple classes with respective credibility degrees. We propose an optimization model for constructing a preference model from such valued examples by maximizing the credible consistency among reference alternatives. To improve the predictive ability of the constructed model on new instances, we employ regularization techniques. Moreover, to enhance the capability of addressing large-scale datasets, we introduce a state-of-the-art algorithm, widely used in the machine learning community, to solve the proposed optimization model in a computationally efficient way. Using the constructed additive value model, we determine both crisp and valued assignments for non-reference alternatives. In addition, we allow the Decision Maker to prioritize the importance of classes and give the method the flexibility to adjust classification performance across classes according to the specified priorities. The practical usefulness of the analytical framework is demonstrated on a real-world dataset by comparing it to several existing sorting methods. |
Tasks | |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05485v1 |
PDF | https://arxiv.org/pdf/1910.05485v1.pdf |
PWC | https://paperswithcode.com/paper/a-preference-learning-framework-for-multiple |
Repo | |
Framework | |
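An illustrative sketch of the additive value model at the core of the framework: an alternative's global value is the sum of per-criterion marginal value functions (piecewise-linear here), and classes are separated by value thresholds. The credibility-weighted learning of these pieces is the paper's contribution and is not reproduced; all numbers below are hypothetical.

```python
import numpy as np

def marginal(x, breakpoints, values):
    # piecewise-linear marginal value function, monotone if `values` is sorted
    return np.interp(x, breakpoints, values)

def global_value(alt, marginals):
    # additive model: sum of marginal values across criteria
    return sum(marginal(alt[j], *m) for j, m in enumerate(marginals))

def assign(value, thresholds):
    # thresholds are ascending lower bounds separating classes 0..k
    return int(np.searchsorted(thresholds, value, side='right'))

# two criteria on [0, 1], three classes (hypothetical numbers)
marginals = [([0.0, 0.5, 1.0], [0.0, 0.3, 0.6]),
             ([0.0, 1.0], [0.0, 0.4])]
thresholds = [0.3, 0.7]
alt = [0.8, 0.6]
v = global_value(alt, marginals)
print(v, "-> class", assign(v, thresholds))
```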
Adaptive Density Estimation for Generative Models
Title | Adaptive Density Estimation for Generative Models |
Authors | Thomas Lucas, Konstantin Shmelkov, Karteek Alahari, Cordelia Schmid, Jakob Verbeek |
Abstract | Unsupervised learning of generative models has seen tremendous progress in recent years, in particular due to generative adversarial networks (GANs), variational autoencoders, and flow-based models. GANs have dramatically improved sample quality, but suffer from two drawbacks: (i) they mode-drop, i.e., do not cover the full support of the training data, and (ii) they do not allow for likelihood evaluations on held-out data. In contrast, likelihood-based training encourages models to cover the full support of the training data, but yields poorer samples. These mutual shortcomings can in principle be addressed by training generative latent variable models in a hybrid adversarial-likelihood manner. However, we show that commonly made parametric assumptions create a conflict between the two, making successful hybrid models non-trivial. As a solution, we propose to use deep invertible transformations in the latent variable decoder. This approach allows for likelihood computations in image space, is more efficient than fully invertible models, and can take full advantage of adversarial training. We show that our model significantly improves over existing hybrid models: it offers GAN-like samples, IS and FID scores competitive with fully adversarial models, and improved likelihood scores. |
Tasks | Density Estimation, Latent Variable Models |
Published | 2019-01-04 |
URL | https://arxiv.org/abs/1901.01091v3 |
PDF | https://arxiv.org/pdf/1901.01091v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-training-of-partially-invertible |
Repo | |
Framework | |
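A hedged sketch of the ingredient the abstract highlights: an invertible transformation (here an affine coupling layer, RealNVP-style) whose Jacobian log-determinant is cheap to compute, so likelihoods can be evaluated exactly in data space while the rest of the decoder stays free-form. Dimensions and the conditioning net are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * self.half))
    def forward(self, x):                      # returns y and log|det J|
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)                      # keep scales bounded
        y2 = x2 * torch.exp(s) + t             # element-wise affine transform
        return torch.cat([x1, y2], dim=1), s.sum(dim=1)
    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=1)
        s = torch.tanh(s)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=1)

layer = AffineCoupling(8)
x = torch.randn(4, 8)
y, logdet = layer(x)
print(torch.allclose(layer.inverse(y), x, atol=1e-5), logdet.shape)
```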
conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT Reads
Title | conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT Reads |
Authors | Angana Chakraborty, Sanghamitra Bandyopadhyay |
Abstract | Single Molecule Real-Time (SMRT) sequencing is a recent advance in next-generation sequencing technology developed by Pacific Biosciences (PacBio). It comes with an explosion of long and noisy reads, demanding cutting-edge research to get the most out of it. To deal with the high error probability of SMRT data, a novel contextual Locality Sensitive Hashing (conLSH) based algorithm is proposed in this article, which can effectively align the noisy SMRT reads to the reference genome. Here, sequences are hashed together based not only on their closeness, but also on the similarity of their contexts. The algorithm has an $\mathcal{O}(n^{\rho+1})$ space requirement, where $n$ is the number of sequences in the corpus and $\rho$ is a constant. The indexing and querying times are bounded by $\mathcal{O}( \frac{n^{\rho+1} \cdot \ln n}{\ln \frac{1}{P_2}})$ and $\mathcal{O}(n^\rho)$ respectively, where $P_2 > 0$ is a probability value. This algorithm is particularly useful for retrieving similar sequences, a widely used task in biology. The proposed conLSH-based aligner is compared with rHAT, popularly used for aligning SMRT reads, and is found to comprehensively beat it in speed as well as in memory requirements. In particular, it takes approximately 24.2% less processing time while saving about 70.3% in peak memory requirement for the H. sapiens PacBio dataset. |
Tasks | |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04925v1 |
PDF | http://arxiv.org/pdf/1903.04925v1.pdf |
PWC | https://paperswithcode.com/paper/conlsh-context-based-locality-sensitive |
Repo | |
Framework | |
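A hedged sketch of context-based hashing on sequences: as in ordinary LSH we sample random positions, but each sampled position contributes its whole context window (not just one symbol) to the hash key, so sequences collide only when they agree in context. The window size, number of sampled positions, and key encoding are illustrative guesses, not the paper's tuned scheme.

```python
import random

def conlsh_key(seq, positions, context=2):
    parts = []
    for p in positions:
        lo, hi = max(0, p - context), p + context + 1
        parts.append(seq[lo:hi])       # context window around sampled position
    return "|".join(parts)

random.seed(0)
reads = ["ACGTACGTGG", "ACGTACGAGG", "TTTTACGTGG"]
positions = sorted(random.sample(range(10), 3))

table = {}
for r in reads:
    table.setdefault(conlsh_key(r, positions), []).append(r)
print(table)   # reads sharing all sampled contexts land in the same bucket
```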
Copying Machine Learning Classifiers
Title | Copying Machine Learning Classifiers |
Authors | Irene Unceta, Jordi Nin, Oriol Pujol |
Abstract | We study model-agnostic copies of machine learning classifiers. We develop the theory behind the problem of copying, highlighting its differences from that of learning, and propose a framework to copy the functionality of any classifier using no prior knowledge of its parameters or training data distribution. We identify the different sources of loss and provide guidelines on how best to generate synthetic sets for the copying process. We further introduce a set of metrics to evaluate copies in practice. We validate our framework through extensive experiments using data from a series of well-known problems. We demonstrate the value of copies in use cases where desiderata such as interpretability, fairness, or productivization constraints need to be addressed. Results show that copies can be exploited to enhance existing solutions and improve them by adding new features and characteristics. |
Tasks | |
Published | 2019-03-05 |
URL | https://arxiv.org/abs/1903.01879v2 |
PDF | https://arxiv.org/pdf/1903.01879v2.pdf |
PWC | https://paperswithcode.com/paper/copying-machine-learning-classifiers |
Repo | |
Framework | |
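A minimal sketch of the copying setup the abstract describes: query a black-box classifier on synthetic points drawn without knowledge of the training distribution, then fit a transparent copy on the resulting (point, label) pairs. The models, sampling range, and agreement metric are illustrative choices, not the paper's prescriptions.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
original = SVC().fit(X, y)                 # the black box to be copied

rng = np.random.default_rng(0)
X_syn = rng.uniform(X.min(0) - 1, X.max(0) + 1, size=(5000, 2))
y_syn = original.predict(X_syn)            # only hard labels are needed

copy = DecisionTreeClassifier(max_depth=6).fit(X_syn, y_syn)
agreement = (copy.predict(X) == original.predict(X)).mean()
print(f"copy/original agreement: {agreement:.3f}")
```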
Factorized MultiClass Boosting
Title | Factorized MultiClass Boosting |
Authors | Igor E. Kuralenok, Yurii Rebryk, Ruslan Solovev, Anton Ermilov |
Abstract | In this paper, we introduce a new approach to the multiclass classification problem. We decompose the problem into a series of regression tasks that are solved with CART trees. The proposed method works significantly faster than state-of-the-art solutions while delivering the same level of model quality. The algorithm is also robust to imbalanced datasets, allowing it to reach high-quality results in significantly less time without class re-balancing. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04904v1 |
PDF | https://arxiv.org/pdf/1909.04904v1.pdf |
PWC | https://paperswithcode.com/paper/factorized-multiclass-boosting |
Repo | |
Framework | |
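A hedged sketch of the decomposition the abstract describes: the K-class problem becomes K regression tasks fit with CART trees, here as a plain one-vs-rest residual-boosting loop on the logistic loss. The paper's factorization and update scheme are more refined; this only illustrates the reduction of multiclass classification to tree regression.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeRegressor

X, y = load_iris(return_X_y=True)
K, rounds, lr = 3, 50, 0.1
F = np.zeros((len(X), K))                  # additive scores, one column per class
ensembles = [[] for _ in range(K)]
for _ in range(rounds):
    for k in range(K):
        target = (y == k).astype(float)
        residual = target - 1 / (1 + np.exp(-F[:, k]))   # logistic residual
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
        ensembles[k].append(tree)
        F[:, k] += lr * tree.predict(X)

print("train accuracy:", (F.argmax(axis=1) == y).mean())
```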
Examining Structure of Word Embeddings with PCA
Title | Examining Structure of Word Embeddings with PCA |
Authors | Tomáš Musil |
Abstract | In this paper, we compare the structure of Czech word embeddings from English-Czech neural machine translation (NMT), word2vec, and sentiment analysis. We show that although it is possible to successfully predict part-of-speech (POS) tags from the word embeddings of word2vec and various translation models, not all of the embedding spaces show the same structure. The information about POS is present in the word2vec embeddings, but the high degree of organization by POS in the NMT decoder suggests that this information is more important for machine translation, and the NMT model therefore represents it in a more direct way. Our method is based on correlating principal component analysis (PCA) dimensions with categorical linguistic data. We also show that further examining histograms of classes along the principal components is important for understanding how information is represented in the embeddings. |
Tasks | Machine Translation, Sentiment Analysis, Word Embeddings |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00114v1 |
PDF | https://arxiv.org/pdf/1906.00114v1.pdf |
PWC | https://paperswithcode.com/paper/190600114 |
Repo | |
Framework | |
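A minimal sketch of the analysis method: project embeddings with PCA and correlate each principal component with a binary linguistic label (e.g. "is this word a noun"). The embeddings and POS tags below are synthetic stand-ins for the Czech data used in the paper, with a signal planted in one direction so the correlation check has something to find.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 50))           # toy word embeddings
is_noun = rng.integers(0, 2, size=1000)     # toy binary POS labels
emb[is_noun == 1, 0] += 2.0                 # plant POS signal in one direction

Z = PCA(n_components=10).fit_transform(emb)
for i in range(10):
    r = np.corrcoef(Z[:, i], is_noun)[0, 1]
    if abs(r) > 0.3:
        print(f"PC{i}: correlation with POS = {r:.2f}")
```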
Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models
Title | Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models |
Authors | Jiayun Dong, Cynthia Rudin |
Abstract | Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what if there were multiple well-performing predictive models, and a specific variable were important to some of them and not to others? In that case, we may not be able to tell from a single well-performing model whether a variable is always important in predicting the outcome. Rather than depending on variable importance for a single predictive model, we would like to explore variable importance for all approximately-equally-accurate predictive models. This work introduces the concept of a variable importance cloud, which maps every variable to its importance for every good predictive model. We show properties of the variable importance cloud and draw connections to other areas of statistics. We introduce variable importance diagrams as a projection of the variable importance cloud into two dimensions for visualization purposes. Experiments with criminal justice, marketing data, and image classification tasks illustrate how variables can change dramatically in importance for approximately-equally-accurate predictive models. |
Tasks | Causal Inference, Image Classification |
Published | 2019-01-10 |
URL | https://arxiv.org/abs/1901.03209v2 |
PDF | https://arxiv.org/pdf/1901.03209v2.pdf |
PWC | https://paperswithcode.com/paper/variable-importance-clouds-a-way-to-explore |
Repo | |
Framework | |
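A hedged empirical sketch of a variable importance cloud: collect many approximately-equally-accurate models (here, logistic regressions fit on bootstrap resamples and kept if within a tolerance of the best accuracy) and record a permutation-importance vector for each, giving one point in the cloud per good model. The paper's Rashomon-set construction is exact; this is only a cheap stand-in with assumed tolerances.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=600, n_features=6, random_state=0)
rng = np.random.default_rng(0)

cloud, accs = [], []
for seed in range(30):
    idx = rng.integers(0, len(X), len(X))              # bootstrap resample
    model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    accs.append(model.score(X, y))
    imp = permutation_importance(model, X, y, n_repeats=5,
                                 random_state=seed).importances_mean
    cloud.append(imp)                                  # one point in the cloud

cloud, accs = np.array(cloud), np.array(accs)
good = accs >= accs.max() - 0.02                       # "good model" tolerance
print("importance ranges across good models:")
print(cloud[good].min(0).round(3), cloud[good].max(0).round(3))
```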