February 2, 2020

Paper Group AWR 2

Uncertainty-Aware Principal Component Analysis. PrivateJobMatch: A Privacy-Oriented Deferred Multi-Match Recommender System for Stable Employment. FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing. Induction of Subgoal Automata for Reinforcement Learning. Variational Bayes under Model Misspecification. Cross-lingual Lan …

Uncertainty-Aware Principal Component Analysis

Title Uncertainty-Aware Principal Component Analysis
Authors Jochen Görtler, Thilo Spinner, Dirk Streeb, Daniel Weiskopf, Oliver Deussen
Abstract We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the PCA sample covariance matrix that respects potential uncertainty in each of the inputs, building the mathematical foundation of our new method: uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables a better understanding of the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets. As a special case, we show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.
Tasks Dimensionality Reduction
Published 2019-05-03
URL https://arxiv.org/abs/1905.01127v4
PDF https://arxiv.org/pdf/1905.01127v4.pdf
PWC https://paperswithcode.com/paper/uncertainty-aware-principal-component
Repo https://github.com/grtlr/uapca
Framework none
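
To make the idea of a covariance matrix that "respects uncertainty in each of the inputs" concrete, here is a minimal sketch. It is not taken from the paper or its repository. It assumes each input is a distribution with a known mean and covariance, weights all inputs equally, and applies the law of total covariance. The function names (`uncertainty_aware_covariance`, `first_component`, `project_normal`) are hypothetical, and power iteration stands in for a full eigendecomposition.

```python
import math

def uncertainty_aware_covariance(means, covs):
    """Covariance of an equal-weight mixture of distributions.

    Law of total covariance: average of the per-input covariances plus
    the covariance of the input means around their common center.
    """
    n, d = len(means), len(means[0])
    center = [sum(m[j] for m in means) / n for j in range(d)]
    total = [[0.0] * d for _ in range(d)]
    for m, c in zip(means, covs):
        for i in range(d):
            for j in range(d):
                spread = (m[i] - center[i]) * (m[j] - center[j])
                total[i][j] += (c[i][j] + spread) / n
    return center, total

def first_component(cov, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project_normal(mean, cov, center, axis):
    """Project N(mean, cov) onto a unit axis; the result is a 1-D normal."""
    d = len(axis)
    mu = sum(axis[i] * (mean[i] - center[i]) for i in range(d))
    var = sum(axis[i] * cov[i][j] * axis[j] for i in range(d) for j in range(d))
    return mu, var

# Two inputs whose means differ along x but whose uncertainty lies along y.
means = [[-1.0, 0.0], [1.0, 0.0]]
covs = [[[0.1, 0.0], [0.0, 4.0]], [[0.1, 0.0], [0.0, 4.0]]]
center, total = uncertainty_aware_covariance(means, covs)
axis = first_component(total)
projected = [project_normal(m, c, center, axis) for m, c in zip(means, covs)]
```

In this toy example, ordinary PCA on the means alone would pick the x-axis. Once the per-input covariances are included, the y-axis dominates, which is the kind of effect a sensitivity analysis over the uncertainty is meant to expose.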

PrivateJobMatch: A Privacy-Oriented Deferred Multi-Match Recommender System for Stable Employment

Title PrivateJobMatch: A Privacy-Oriented Deferred Multi-Match Recommender System for Stable Employment
Authors Amar Saini
Abstract Coordination failure reduces match quality among employers and candidates in the job market, resulting in a large number of unfilled positions and/or unstable, short-term employment. Centralized job search engines provide a platform that directly connects employers with job-seekers. However, they require users to disclose a significant amount of personal data, i.e., build a user profile, in order to provide meaningful recommendations. In this paper, we present PrivateJobMatch – a privacy-oriented deferred multi-match recommender system – which generates stable pairings while requiring users to provide only a partial ranking of their preferences. PrivateJobMatch explores a series of adaptations of the game-theoretic Gale-Shapley deferred-acceptance algorithm which combine the flexibility of decentralized markets with the intelligence of centralized matching. We identify the shortcomings of the original algorithm when applied to a job market and propose novel solutions that rely on machine learning techniques. Experimental results on real and synthetic data confirm the benefits of the proposed algorithms across several quality measures. Over the past year, we have implemented a PrivateJobMatch prototype and deployed it in an active job market economy. Using the gathered real-user preference data, we find that the match-recommendations are superior to a typical decentralized job market, while requiring only a partial ranking of the user preferences.
Tasks Recommendation Systems
Published 2019-05-11
URL https://arxiv.org/abs/1905.04564v1
PDF https://arxiv.org/pdf/1905.04564v1.pdf
PWC https://paperswithcode.com/paper/privatejobmatch-a-privacy-oriented-deferred
Repo https://github.com/AmarSaini/PrivateJobMatch
Framework none
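
For readers unfamiliar with the Gale-Shapley deferred-acceptance algorithm that the abstract builds on, the following is a minimal sketch of the textbook, candidate-proposing version with one position per employer. It is not PrivateJobMatch itself and includes none of the paper's adaptations. The function name and the handling of partial rankings (anyone left off a list is treated as unacceptable) are assumptions made for illustration.

```python
def deferred_acceptance(candidate_prefs, employer_prefs):
    """Candidate-proposing deferred acceptance, one position per employer.

    Both arguments map a name to a ranked list, best first. Lists may be
    partial: anyone missing from a list is treated as unacceptable.
    Returns a dict mapping each matched employer to its candidate.
    """
    # Precompute each employer's rank of every candidate it listed.
    rank = {e: {c: i for i, c in enumerate(prefs)}
            for e, prefs in employer_prefs.items()}
    next_choice = {c: 0 for c in candidate_prefs}
    held = {}  # employer -> candidate it is tentatively holding
    free = list(candidate_prefs)
    while free:
        c = free.pop()
        prefs = candidate_prefs[c]
        if next_choice[c] >= len(prefs):
            continue  # list exhausted, candidate stays unmatched
        e = prefs[next_choice[c]]
        next_choice[c] += 1
        if e not in rank or c not in rank[e]:
            free.append(c)  # employer did not rank this candidate
        elif e not in held:
            held[e] = c  # tentatively accept the first acceptable proposal
        elif rank[e][c] < rank[e][held[e]]:
            free.append(held[e])  # employer trades up, old candidate is freed
            held[e] = c
        else:
            free.append(c)  # rejected, candidate moves down their list
    return held

matching = deferred_acceptance(
    {"ann": ["x", "y"], "bob": ["x", "y"]},
    {"x": ["bob", "ann"], "y": ["ann", "bob"]},
)
```

Acceptances are only tentative until the loop ends, which is what "deferred" refers to: an employer can always drop a held candidate for a better one that proposes later.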

FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing

Title FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing
Authors Yi Luo, Enea Ceolini, Cong Han, Shih-Chii Liu, Nima Mesgarani
Abstract Beamforming has been extensively investigated for multi-channel audio processing tasks. Recently, learning-based beamforming methods, sometimes called neural beamformers, have achieved significant improvements in both signal quality (e.g. signal-to-noise ratio (SNR)) and speech recognition (e.g. word error rate (WER)). Such systems are generally non-causal and require a large context for robust estimation of inter-channel features, which is impractical in applications requiring low-latency responses. In this paper, we propose filter-and-sum network (FaSNet), a time-domain, filter-based beamforming approach suitable for low-latency scenarios. FaSNet has a two-stage system design that first learns frame-level time-domain adaptive beamforming filters for a selected reference channel, and then calculates the filters for all remaining channels. The filtered outputs at all channels are summed to generate the final output. Experiments show that despite its small model size, FaSNet is able to outperform several traditional oracle beamformers with respect to scale-invariant signal-to-noise ratio (SI-SNR) in reverberant speech enhancement and separation tasks. Moreover, when trained with a frequency-domain objective function on the CHiME-3 dataset, FaSNet achieves 14.3% relative word error rate reduction (RWERR) compared with the baseline model. These results show the efficacy of FaSNet particularly in reverberant and noisy signal conditions.
Tasks Speech Enhancement, Speech Recognition
Published 2019-09-29
URL https://arxiv.org/abs/1909.13387v2
PDF https://arxiv.org/pdf/1909.13387v2.pdf
PWC https://paperswithcode.com/paper/fasnet-low-latency-adaptive-beamforming-for
Repo https://github.com/yluo42/TAC
Framework pytorch
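
The two signal-processing ideas named in the abstract, filter-and-sum beamforming and SI-SNR, can be sketched in a few lines. This is an illustration only: the filters here are fixed FIR taps supplied by the caller, whereas FaSNet estimates frame-level filters with a neural network. The function names are hypothetical, and the SI-SNR helper assumes zero-mean signals and a non-perfect estimate (it divides by the residual energy).

```python
import math

def convolve(signal, taps):
    """Causal FIR filtering, truncated to the input length."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]

def filter_and_sum(channels, filters):
    """Filter each microphone channel with its own taps, then sum."""
    filtered = [convolve(x, h) for x, h in zip(channels, filters)]
    return [sum(samples) for samples in zip(*filtered)]

def si_snr(estimate, target):
    """Scale-invariant signal-to-noise ratio in dB."""
    dot = sum(e * t for e, t in zip(estimate, target))
    energy = sum(t * t for t in target)
    # Component of the estimate that lies along the target.
    projection = [dot / energy * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_power = sum(p * p for p in projection)
    noise_power = sum(r * r for r in residual)
    return 10 * math.log10(signal_power / noise_power)

# Two identical channels averaged with single-tap filters.
output = filter_and_sum([[1, 2, 3, 4], [1, 2, 3, 4]], [[0.5], [0.5]])

target = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 1.2, -1.0]
score = si_snr(estimate, target)
```

Because the estimate is first projected onto the target, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant.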

Induction of Subgoal Automata for Reinforcement Learning

Title Induction of Subgoal Automata for Reinforcement Learning
Authors Daniel Furelos-Blanco, Mark Law, Alessandra Russo, Krysia Broda, Anders Jonsson
Abstract In this work we present ISA, a novel approach for learning and exploiting subgoals in reinforcement learning (RL). Our method relies on inducing an automaton whose transitions are subgoals expressed as propositional formulas over a set of observable events. A state-of-the-art inductive logic programming system is used to learn the automaton from observation traces perceived by the RL agent. The reinforcement learning and automaton learning processes are interleaved: a new refined automaton is learned whenever the RL agent generates a trace not recognized by the current automaton. We evaluate ISA in several gridworld problems and show that it performs similarly to a method for which automata are given in advance. We also show that the learned automata can be exploited to speed up convergence through reward shaping and transfer learning across multiple tasks. Finally, we analyze the running time and the number of traces that ISA needs to learn an automaton, and the impact that the number of observable events has on the learner’s performance.
Tasks Transfer Learning
Published 2019-11-29
URL https://arxiv.org/abs/1911.13152v1
PDF https://arxiv.org/pdf/1911.13152v1.pdf
PWC https://paperswithcode.com/paper/induction-of-subgoal-automata-for
Repo https://github.com/ertsiger/induction-subgoal-automata-rl
Framework none
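
One common way to exploit an automaton for reward shaping is potential-based shaping, where the potential of an automaton state grows as it gets closer to the accepting state. The sketch below assumes that form. The abstract does not specify the shaping function ISA uses, so the potential (negative shortest distance to the accepting state) and the function names are assumptions for illustration.

```python
from collections import deque

def distances_to_accepting(transitions, accepting):
    """Shortest number of automaton transitions from each state to `accepting`.

    `transitions` maps a state to an iterable of successor states.
    States that cannot reach the accepting state are left out.
    """
    reverse = {}
    for u, successors in transitions.items():
        for v in successors:
            reverse.setdefault(v, []).append(u)
    dist = {accepting: 0}
    queue = deque([accepting])
    while queue:  # breadth-first search over reversed edges
        v = queue.popleft()
        for u in reverse.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def shaped_reward(reward, u, u_next, dist, gamma=0.99):
    """Add the potential-based term gamma * phi(u_next) - phi(u)."""
    phi_u, phi_next = -dist[u], -dist[u_next]
    return reward + gamma * phi_next - phi_u

# A three-state chain: u0 -> u1 -> acc.
automaton = {"u0": ["u1"], "u1": ["acc"], "acc": []}
dist = distances_to_accepting(automaton, "acc")
bonus = shaped_reward(0.0, "u0", "u1", dist, gamma=1.0)
```

With this potential, a transition that moves the automaton one step closer to acceptance earns a positive bonus, while staying in the same automaton state earns none when `gamma` is 1.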

Variational Bayes under Model Misspecification

Title Variational Bayes under Model Misspecification
Authors Yixin Wang, David M. Blei
Abstract Variational Bayes (VB) is a scalable alternative to Markov chain Monte Carlo (MCMC) for Bayesian posterior inference. Though popular, VB comes with few theoretical guarantees, most of which focus on well-specified models. However, models are rarely well-specified in practice. In this work, we study VB under model misspecification. We prove the VB posterior is asymptotically normal and centers at the value that minimizes the Kullback-Leibler (KL) divergence to the true data-generating distribution. Moreover, the VB posterior mean centers at the same value and is also asymptotically normal. These results generalize the variational Bernstein–von Mises theorem [29] to misspecified models. As a consequence of these results, we find that the model misspecification error dominates the variational approximation error in VB posterior predictive distributions. It explains the widely observed phenomenon that VB achieves comparable predictive accuracy with MCMC even though VB uses an approximating family. As illustrations, we study VB under three forms of model misspecification, ranging from model over-/under-dispersion to latent dimensionality misspecification. We conduct two simulation studies that demonstrate the theoretical results.
Published 2019-05-26
URL https://arxiv.org/abs/1905.10859v1
PDF https://arxiv.org/pdf/1905.10859v1.pdf
PWC https://paperswithcode.com/paper/variational-bayes-under-model
Repo https://github.com/yixinwang/vbmisspec-public
Framework none

Cross-lingual Language Model Pretraining

Title Cross-lingual Language Model Pretraining
Authors Guillaume Lample, Alexis Conneau
Abstract Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT’16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT’16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.
Tasks Language Modelling, Machine Translation, Unsupervised Machine Translation
Published 2019-01-22
URL http://arxiv.org/abs/1901.07291v1
PDF http://arxiv.org/pdf/1901.07291v1.pdf
PWC https://paperswithcode.com/paper/cross-lingual-language-model-pretraining
Repo https://github.com/facebookresearch/MLQA
Framework none

Non-Parallel Voice Conversion with Cyclic Variational Autoencoder

Title Non-Parallel Voice Conversion with Cyclic Variational Autoencoder
Authors Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
Abstract In this paper, we present a novel technique for non-parallel voice conversion (VC) with the use of cyclic variational autoencoder (CycleVAE)-based spectral modeling. In a variational autoencoder (VAE) framework, a latent space, usually with a Gaussian prior, is used to encode a set of input features. In a VAE-based VC, the encoded latent features are fed into a decoder, along with speaker-coding features, to generate estimated spectra with either the original speaker identity (reconstructed) or another speaker identity (converted). Due to the non-parallel modeling condition, the converted spectra cannot be directly optimized, which heavily degrades the performance of a VAE-based VC. In this work, to overcome this problem, we propose to use a CycleVAE-based spectral model that indirectly optimizes the conversion flow by recycling the converted features back into the system to obtain corresponding cyclic reconstructed spectra that can be directly optimized. The cyclic flow can be continued by using the cyclic reconstructed features as input for the next cycle. The experimental results demonstrate the effectiveness of the proposed CycleVAE-based VC, which yields higher accuracy of converted spectra, generates latent features with higher correlation degree, and significantly improves the quality and conversion accuracy of the converted speech.
Tasks Voice Conversion
Published 2019-07-24
URL https://arxiv.org/abs/1907.10185v1
PDF https://arxiv.org/pdf/1907.10185v1.pdf
PWC https://paperswithcode.com/paper/non-parallel-voice-conversion-with-cyclic
Repo https://github.com/patrickltobing/cyclevae-vc
Framework pytorch

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Title Hierarchical Optimal Transport for Multimodal Distribution Alignment
Authors John Lee, Max Dabagia, Eva L. Dyer, Christopher J. Rozell
Abstract In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance; it therefore has an efficient computational complexity that scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as provide worst-case dataset geometry for such a strategy. We apply this method to synthetic datasets that model data as mixtures of low-rank Gaussians and study the impact that different geometric properties of the data have on alignment. Next, we apply our approach to a neural decoding application where the goal is to predict movement directions and instantaneous velocities from populations of neurons in the macaque primary motor cortex. Our results demonstrate that when clustered structure exists in datasets, and is consistent across trials or time points, a hierarchical alignment strategy that leverages such structure can provide significant improvements in cross-domain alignment.
Published 2019-06-27
URL https://arxiv.org/abs/1906.11768v2
PDF https://arxiv.org/pdf/1906.11768v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-optimal-transport-for-multimodal
Repo https://github.com/siplab-gt/hiwa-matlab
Framework none
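
The Sinkhorn distance mentioned in the abstract comes from entropy-regularized optimal transport, which can be solved by alternately rescaling the rows and columns of a kernel matrix. The following is a minimal sketch of that basic iteration only, not the paper's hierarchical formulation or its ADMM solver. The function name, the regularization strength `epsilon`, and the fixed iteration count are assumptions.

```python
import math

def sinkhorn(cost, a, b, epsilon=0.1, iterations=500):
    """Entropy-regularized optimal transport between histograms a and b.

    `cost[i][j]` is the cost of moving mass from source bin i to target
    bin j. Returns the transport plan and its total transport cost.
    """
    n, m = len(a), len(b)
    # Gibbs kernel: small cost means large affinity.
    K = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    value = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, value

# Two bins on each side; moving mass to the matching bin is free.
plan, value = sinkhorn([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5], [0.5, 0.5])
```

Each iteration costs one pass over the kernel matrix, so the work per iteration grows with the product of the two histogram sizes.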
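
Periodic Spectral Ergodicity: A Complexity Measure for Deep Neural Networks and Neural Architecture Search

Title Periodic Spectral Ergodicity: A Complexity Measure for Deep Neural Networks and Neural Architecture Search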
Authors Mehmet Süzen, J. J. Cerdà, Cornelius Weber
Abstract Establishing associations between the structure and the generalisation ability of deep neural networks (DNNs) is a challenging task in modern machine learning. Producing solutions to this challenge will bring progress both in the theoretical understanding of DNNs and in building new architectures efficiently. In this work, we address this challenge by developing a new complexity measure based on the concept of Periodic Spectral Ergodicity (PSE) originating from quantum statistical mechanics. Based on this measure a technique is devised to quantify the complexity of deep neural networks from the learned weights and traversing the network connectivity in a sequential manner, hence the term cascading PSE (cPSE), as an empirical complexity measure. This measure will capture both topological and internal neural processing complexity simultaneously. Because of this cascading approach, i.e., a symmetric divergence of PSE on the consecutive layers, it is possible to use this measure for Neural Architecture Search (NAS). We demonstrate the usefulness of this measure in practice on two sets of vision models, ResNet and VGG, and sketch the computation of cPSE for more complex network structures.
Tasks Neural Architecture Search
Published 2019-11-10
URL https://arxiv.org/abs/1911.07831v3
PDF https://arxiv.org/pdf/1911.07831v3.pdf
PWC https://paperswithcode.com/paper/periodic-spectral-ergodicity-a-complexity
Repo https://github.com/msuzen/bristol
Framework pytorch

Predicting Fluid Intelligence of Children using T1-weighted MR Images and a StackNet

Title Predicting Fluid Intelligence of Children using T1-weighted MR Images and a StackNet
Authors Po-Yu Kao, Angela Zhang, Michael Goebel, Jefferson W. Chen, B. S. Manjunath
Abstract In this work, we utilize T1-weighted MR images and StackNet to predict fluid intelligence in adolescents. Our framework includes feature extraction, feature normalization, feature denoising, feature selection, training a StackNet, and predicting fluid intelligence. The extracted feature is the distribution of different brain tissues in different brain parcellation regions. The proposed StackNet consists of three layers and 11 models. Each layer uses the predictions from all previous layers including the input layer. The proposed StackNet is tested on a public benchmark Adolescent Brain Cognitive Development Neurocognitive Prediction Challenge 2019 and achieves a mean squared error of 82.42 on the combined training and validation set with 10-fold cross-validation. In addition, the proposed StackNet also achieves a mean squared error of 94.25 on the testing data. The source code is available on GitHub.
Tasks Denoising, Feature Selection
Published 2019-04-16
URL https://arxiv.org/abs/1904.07387v2
PDF https://arxiv.org/pdf/1904.07387v2.pdf
PWC https://paperswithcode.com/paper/predicting-fluid-intelligence-of-children
Repo https://github.com/pykao/ABCD-MICCAI2019
Framework none

HuggingFace’s Transformers: State-of-the-art Natural Language Processing

Title HuggingFace’s Transformers: State-of-the-art Natural Language Processing
Authors Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Jamie Brew
Abstract Recent advances in modern Natural Language Processing (NLP) research have been dominated by the combination of Transfer Learning methods with large-scale language models, in particular based on the Transformer architecture. With them came a paradigm shift in NLP with the starting point for training a model on a downstream task moving from a blank specific model to a general-purpose pretrained architecture. Still, creating these general-purpose models remains an expensive and time-consuming process restricting the use of these methods to a small sub-set of the wider NLP community. In this paper, we present HuggingFace’s Transformers library, a library for state-of-the-art NLP, making these developments available to the community by gathering state-of-the-art general-purpose pretrained models under a unified API together with an ecosystem of libraries, examples, tutorials and scripts targeting many downstream NLP tasks. HuggingFace’s Transformers library features carefully crafted model implementations and high-performance pretrained weights for two main deep learning frameworks, PyTorch and TensorFlow, while supporting all the necessary tools to analyze, evaluate and use these models in downstream tasks such as text/token classification, question answering and language generation among others. The library has gained significant organic traction and adoption among both the researcher and practitioner communities. We are committed at HuggingFace to pursue the efforts to develop this toolkit with the ambition of creating the standard library for building NLP systems. HuggingFace’s Transformers library is available at https://github.com/huggingface/transformers.
Tasks Text Generation, Transfer Learning
Published 2019-10-09
URL https://arxiv.org/abs/1910.03771v4
PDF https://arxiv.org/pdf/1910.03771v4.pdf
PWC https://paperswithcode.com/paper/transformers-state-of-the-art-natural
Repo https://github.com/huggingface/transformers
Framework pytorch

Explore Entity Embedding Effectiveness in Entity Retrieval

Title Explore Entity Embedding Effectiveness in Entity Retrieval
Authors Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu
Abstract This paper explores entity embedding effectiveness in ad-hoc entity retrieval, which introduces distributed representation of entities into entity retrieval. The knowledge graph contains lots of knowledge and models entity semantic relations with the well-formed structural representation. Entity embedding learns lots of semantic information from the knowledge graph and represents entities with a low-dimensional representation, which provides an opportunity to establish interactions between query related entities and candidate entities for entity retrieval. Our experiments demonstrate the effectiveness of the entity embedding based model, which achieves more than 5% improvement over the previous state-of-the-art learning to rank based entity retrieval model. Our further analysis reveals that the entity semantic match feature is effective, especially for scenarios that need more semantic understanding.
Tasks Learning-To-Rank
Published 2019-08-28
URL https://arxiv.org/abs/1908.10554v1
PDF https://arxiv.org/pdf/1908.10554v1.pdf
PWC https://paperswithcode.com/paper/explore-entity-embedding-effectiveness-in
Repo https://github.com/thunlp/EmbeddingEntityRetrieval
Framework none

Domain Adaptation for Structured Output via Discriminative Patch Representations

Title Domain Adaptation for Structured Output via Discriminative Patch Representations
Authors Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker
Abstract Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks. However, models trained on one data domain may not generalize well to other domains without annotations for model finetuning. To avoid the labor-intensive process of annotation, we develop a domain adaptation method to adapt the source data to the unlabeled target domain. We propose to learn discriminative feature representations of patches in the source domain by discovering multiple modes of patch-wise output distribution through the construction of a clustered space. With such representations as guidance, we use an adversarial learning scheme to push the feature representations of target patches in the clustered space closer to the distributions of source patches. In addition, we show that our framework is complementary to existing domain adaptation techniques and achieves consistent improvements on semantic segmentation. Extensive ablations and results are demonstrated on numerous benchmark datasets with various settings, such as synthetic-to-real and cross-city scenarios.
Tasks Domain Adaptation, Image-to-Image Translation, Semantic Segmentation, Synthetic-to-Real Translation
Published 2019-01-16
URL https://arxiv.org/abs/1901.05427v4
PDF https://arxiv.org/pdf/1901.05427v4.pdf
PWC https://paperswithcode.com/paper/domain-adaptation-for-structured-output-via
Repo https://github.com/lym29/DASeg
Framework pytorch
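
Machine learning and chord based feature engineering for genre prediction in popular Brazilian music

Title Machine learning and chord based feature engineering for genre prediction in popular Brazilian music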
Authors Bruna D. Wundervald, Walmes M. Zeviani
Abstract Music genre can be hard to describe: many factors are involved, such as style, music technique, and historical context. Some genres even have overlapping characteristics. Looking for a better understanding of how music genres are related to musical harmonic structures, we gathered data about the music chords for thousands of popular Brazilian songs. Here, ‘popular’ does not only refer to the genre named MPB (Brazilian Popular Music) but to nine different genres that were considered particular to the Brazilian case. The main goals of the present work are to extract and engineer harmonically related features from chords data and to use it to classify popular Brazilian music genres towards establishing a connection between harmonic relationships and Brazilian genres. We also emphasize the generalization of the method for obtaining the data, allowing for the replication and direct extension of this work. Our final model is a combination of multiple classification trees, also known as the random forest model. We found that features extracted from harmonic elements can satisfactorily predict music genre for the Brazilian case, as well as features obtained from the Spotify API. The variables considered in this work also give an intuition about how they relate to the genres.
Tasks Feature Engineering, Music Genre Recognition
Published 2019-02-08
URL http://arxiv.org/abs/1902.03283v1
PDF http://arxiv.org/pdf/1902.03283v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-and-chord-based-feature
Repo https://github.com/brunaw/genre_classification
Framework none

Beholder-GAN: Generation and Beautification of Facial Images with Conditioning on Their Beauty Level

Title Beholder-GAN: Generation and Beautification of Facial Images with Conditioning on Their Beauty Level
Authors Nir Diamant, Dean Zadok, Chaim Baskin, Eli Schwartz, Alex M. Bronstein
Abstract Beauty is in the eye of the beholder. This maxim, emphasizing the subjectivity of the perception of beauty, has enjoyed a wide consensus since ancient times. In the digital era, data-driven methods have been shown to be able to predict human-assigned beauty scores for facial images. In this work, we augment this ability and train a generative model that generates faces conditioned on a requested beauty score. In addition, we show how this trained generator can be used to beautify an input face image. By doing so, we achieve an unsupervised beautification model, in the sense that it relies on no ground truth target images.
Published 2019-02-07
URL http://arxiv.org/abs/1902.02593v3
PDF http://arxiv.org/pdf/1902.02593v3.pdf
PWC https://paperswithcode.com/paper/beholder-gan-generation-and-beautification-of
Repo https://github.com/deanzadok/Beholder-GAN
Framework pytorch