January 30, 2020

Paper Group ANR 312

Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

Title Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees
Authors Mohammadhosein Hasanbeig, Yiannis Kantaros, Alessandro Abate, Daniel Kroening, George J. Pappas, Insup Lee
Abstract Reinforcement Learning (RL) has emerged as an efficient method of choice for solving complex sequential decision-making problems in automatic control, computer science, economics, and biology. In this paper we present a model-free RL algorithm to synthesize control policies that maximize the probability of satisfying high-level control objectives given as Linear Temporal Logic (LTL) formulas. Uncertainty is considered in the workspace properties, the structure of the workspace, and the agent actions, giving rise to a Probabilistically-Labeled Markov Decision Process (PL-MDP) with unknown graph structure and stochastic behaviour, which is an even more general case than a fully unknown MDP. We first translate the LTL specification into a Limit Deterministic Büchi Automaton (LDBA), which is then used in an on-the-fly product with the PL-MDP. Thereafter, we define a synchronous reward function based on the acceptance condition of the LDBA. Finally, we show that the RL algorithm delivers a policy that maximizes the satisfaction probability asymptotically. We provide experimental results that showcase the efficiency of the proposed method.
Tasks Decision Making
Published 2019-09-11
URL https://arxiv.org/abs/1909.05304v1
PDF https://arxiv.org/pdf/1909.05304v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-for-temporal-logic
Repo
Framework
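
A minimal sketch helps make the central construction concrete: run Q-learning on the on-the-fly product of the MDP state and the automaton state, with a synchronous reward paid whenever the automaton enters an accepting state. The toy chain, labeling function, and two-state automaton below are illustrative stand-ins, not the paper's PL-MDP benchmarks or its full LDBA machinery:

```python
import random
from collections import defaultdict

# Hypothetical stand-ins: a 5-state labeled chain and a 2-state automaton
# whose accepting state is reached once the "goal" label is seen.
ACTIONS = ["left", "right"]

def mdp_step(s, a):
    """Stochastic chain on {0..4}; the chosen move succeeds with prob 0.8."""
    if random.random() < 0.8:
        s = max(0, s - 1) if a == "left" else min(4, s + 1)
    return s

def label(s):
    return "goal" if s == 4 else "none"

def automaton_step(q, lab):
    """Toy automaton for 'eventually goal'; state 1 is accepting."""
    return 1 if lab == "goal" or q == 1 else q

ACCEPTING = {1}

# Q-learning on the on-the-fly product (s, q), rewarding accepting visits.
Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.99, 0.2
for episode in range(2000):
    s, q = 0, 0
    for _ in range(50):
        a = random.choice(ACTIONS) if random.random() < eps else \
            max(ACTIONS, key=lambda a_: Q[(s, q, a_)])
        s2 = mdp_step(s, a)
        q2 = automaton_step(q, label(s2))
        r = 1.0 if q2 in ACCEPTING else 0.0   # synchronous reward
        best = max(Q[(s2, q2, a_)] for a_ in ACTIONS)
        Q[(s, q, a)] += alpha * (r + gamma * best - Q[(s, q, a)])
        s, q = s2, q2

print("greedy action at (s=3, q=0):",
      max(ACTIONS, key=lambda a_: Q[(3, 0, a_)]))
```

The sketch only mirrors the product-and-reward plumbing; the paper's analysis ties the reward to the LDBA's acceptance condition and shows the resulting policy maximizes the satisfaction probability asymptotically.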

Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

Title Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning
Authors Valerio Perrone, Huibin Shen, Matthias Seeger, Cedric Archambeau, Rodolphe Jenatton
Abstract Bayesian optimization (BO) is a successful methodology to optimize black-box functions that are expensive to evaluate. While traditional methods optimize each black-box function in isolation, there has been recent interest in speeding up BO by transferring knowledge across multiple related black-box functions. In this work, we introduce a method to automatically design the BO search space by relying on evaluations of previous black-box functions. We depart from the common practice of defining a set of arbitrary search ranges a priori by considering search space geometries that are learned from historical data. This simple, yet effective strategy can be used to endow many existing BO methods with transfer learning properties. Despite its simplicity, we show that our approach considerably boosts BO by reducing the size of the search space, thus accelerating the optimization of a variety of black-box optimization problems. In particular, the proposed approach combined with random search results in a parameter-free, easy-to-implement, robust hyperparameter optimization strategy. We hope it will constitute a natural baseline for further research attempting to warm-start BO.
Tasks Hyperparameter Optimization, Transfer Learning
Published 2019-09-27
URL https://arxiv.org/abs/1909.12552v1
PDF https://arxiv.org/pdf/1909.12552v1.pdf
PWC https://paperswithcode.com/paper/learning-search-spaces-for-bayesian
Repo
Framework
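
The simplest instance of the idea admits a short sketch: take the best configuration from each historical task and restrict the new search to the tightest box containing them. The synthetic tasks and the two hyperparameters below are assumptions for illustration; the paper also studies search-space geometries learned more carefully than this bounding box:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical historical data: for each past task, 50 evaluated configs
# (log10 learning rate, dropout) with noisy losses.
histories = []
for task in range(5):
    X = rng.uniform([-6.0, 0.0], [0.0, 0.9], size=(50, 2))
    loss = (X[:, 0] + 3.0) ** 2 + (X[:, 1] - 0.3) ** 2 + rng.normal(0, 0.1, 50)
    histories.append((X, loss))

# Learn a box search space: the tightest bounding box that contains each
# past task's best configuration.
best = np.array([X[np.argmin(l)] for X, l in histories])
low, high = best.min(axis=0), best.max(axis=0)
print("learned search box:", low, high)

# Any BO method can now search inside the box; combined with plain random
# search this is the parameter-free baseline mentioned in the abstract.
candidates = rng.uniform(low, high, size=(20, 2))
print(candidates.shape)
```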

Synchronous Transformers for End-to-End Speech Recognition

Title Synchronous Transformers for End-to-End Speech Recognition
Authors Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen
Abstract In most attention-based sequence-to-sequence models, the decoder predicts the output sequence conditioned on the entire input sequence processed by the encoder. This asynchrony between encoding and decoding makes such models difficult to apply to online speech recognition. In this paper, we propose a model named the synchronous transformer to address this problem, which can predict the output sequence chunk by chunk. As soon as a fixed-length chunk of the input sequence has been processed by the encoder, the decoder begins to predict symbols immediately. During training, a forward-backward algorithm is introduced to optimize over all possible alignment paths. Our model is evaluated on the Mandarin dataset AISHELL-1. The experiments show that the synchronous transformer is able to perform encoding and decoding synchronously, and achieves a character error rate of 8.91% on the test set.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2019-12-06
URL https://arxiv.org/abs/1912.02958v2
PDF https://arxiv.org/pdf/1912.02958v2.pdf
PWC https://paperswithcode.com/paper/synchronous-transformers-for-end-to-end
Repo
Framework
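
The chunk-synchronous control flow can be sketched independently of the model internals. Below, encode_chunk and decode_step are hypothetical stubs standing in for the Transformer encoder and decoder, and the forward-backward training over alignment paths is omitted; only the "encode a chunk, then decode until the model yields" loop is illustrated:

```python
CHUNK = 4          # frames per chunk (illustrative)
BLANK = "<wb>"     # hypothetical boundary symbol: "wait for the next chunk"

def encode_chunk(frames, memory):
    """Stub encoder: extend a running memory with the new chunk's summary."""
    return memory + [sum(frames) / len(frames)]

def decode_step(memory, prefix):
    """Stub decoder: emit one symbol per chunk, then yield with BLANK."""
    if prefix and prefix[-1] != BLANK:
        return BLANK
    return f"sym{len(memory)}"

def synchronous_decode(frames):
    memory, hyp = [], []
    for i in range(0, len(frames), CHUNK):
        # As soon as a fixed-length chunk is encoded ...
        memory = encode_chunk(frames[i:i + CHUNK], memory)
        # ... the decoder starts predicting symbols immediately.
        while True:
            sym = decode_step(memory, hyp)
            hyp.append(sym)
            if sym == BLANK:
                break
    return [s for s in hyp if s != BLANK]

print(synchronous_decode(list(range(12))))  # ['sym1', 'sym2', 'sym3']
```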

The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?

Title The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?
Authors Nikolas Kantas, Panos Parpas, Grigorios A. Pavliotis
Abstract An open problem in machine learning is whether flat minima generalize better and how to compute such minima efficiently. This is a very challenging problem. As a first step towards understanding this question, we formalize it as an optimization problem with weakly interacting agents. We review appropriate background material from the theory of stochastic processes and provide insights that are relevant to practitioners. We propose an algorithmic framework for an extended stochastic gradient Langevin dynamics and illustrate its potential. The paper is written as a tutorial and presents an alternative use of multi-agent learning. Our primary focus is on the design of algorithms for machine learning applications; however, the underlying mathematical framework is suitable for understanding large-scale systems of agent-based models that are popular in the social sciences, economics, and finance.
Tasks
Published 2019-05-10
URL https://arxiv.org/abs/1905.04121v1
PDF https://arxiv.org/pdf/1905.04121v1.pdf
PWC https://paperswithcode.com/paper/the-sharp-the-flat-and-the-shallow-can-weakly
Repo
Framework
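
One concrete reading of "weakly interacting agents" is an ensemble of Langevin samplers coupled through their empirical mean. The double-well objective, step size, and coupling strength below are illustrative assumptions, a sketch of an extended stochastic gradient Langevin dynamics rather than the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_V(x):
    # Toy double-well objective V(x) = (x^2 - 1)^2 (illustrative only).
    return 4 * x * (x ** 2 - 1)

N, steps, h = 100, 5000, 1e-3   # agents, iterations, step size (assumed)
beta, lam = 5.0, 1.0            # inverse temperature, interaction strength

x = rng.normal(0.0, 0.1, size=N)
for _ in range(steps):
    attract = lam * (x - x.mean())              # weak mean-field coupling
    noise = np.sqrt(2 * h / beta) * rng.standard_normal(N)
    x = x - h * (grad_V(x) + attract) + noise   # interacting Langevin step

print("fraction of agents in each well:", np.mean(x < 0), np.mean(x > 0))
```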

Bimodal Stereo: Joint Shape and Pose Estimation from Color-Depth Image Pair

Title Bimodal Stereo: Joint Shape and Pose Estimation from Color-Depth Image Pair
Authors Chi Zhang, Yuehu Liu, Ying Wu, Qilin Zhang, Le Wang
Abstract Mutual calibration between color and depth cameras is a challenging topic in multi-modal data registration. In this paper, we are confronted with a “Bimodal Stereo” problem, which aims to automatically recover the camera pose from a pair consisting of an uncalibrated color image and a depth map taken from different views. To address this problem, an iterative Shape-from-Shading (SfS) based framework is proposed to estimate shape and pose simultaneously. In the pipeline, the estimated shape is refined by the shape prior from the given depth map under the estimated pose. Meanwhile, the estimated pose is improved by registering the estimated shape against the shape from the given depth map. We also introduce a shading-based refinement into the pipeline to handle noisy depth maps with holes. Extensive experiments show that our method desirably refines and recovers the depth map, the estimated shape, and its pose.
Tasks Calibration, Pose Estimation
Published 2019-05-16
URL https://arxiv.org/abs/1905.06499v1
PDF https://arxiv.org/pdf/1905.06499v1.pdf
PWC https://paperswithcode.com/paper/bimodal-stereo-joint-shape-and-pose
Repo
Framework
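
The alternating structure of the pipeline, refining pose by registration and then refining shape under the new pose, can be caricatured with a 1-D analogue in which "shape" is a profile and "pose" is an unknown circular shift. This is only a toy stand-in for the SfS-based method:

```python
import numpy as np

rng = np.random.default_rng(2)

# Ground truth: a profile observed as a "depth map" under an unknown shift.
true_shape = np.sin(np.linspace(0, 3 * np.pi, 64))
true_shift = 5
depth = np.roll(true_shape, true_shift) + rng.normal(0, 0.05, 64)

# Stand-in for an initial Shape-from-Shading estimate in the source view.
shape = true_shape + rng.normal(0, 0.1, 64)
shift = 0
for _ in range(10):
    # Pose step: register the current shape against the given depth map.
    errs = [np.sum((np.roll(shape, k) - depth) ** 2) for k in range(64)]
    shift = int(np.argmin(errs))
    # Shape step: refine the shape with the depth prior under that pose.
    shape = 0.5 * shape + 0.5 * np.roll(depth, -shift)

print("estimated shift:", shift)  # recovers 5 on this toy problem
```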

Transferable Neural Processes for Hyperparameter Optimization

Title Transferable Neural Processes for Hyperparameter Optimization
Authors Ying Wei, Peilin Zhao, Huaxiu Yao, Junzhou Huang
Abstract Automated machine learning aims to automate the whole process of machine learning, including model configuration. In this paper, we focus on automated hyperparameter optimization (HPO) based on sequential model-based optimization (SMBO). Though conventional SMBO algorithms work well when abundant HPO trials are available, they fall short in practical applications where a trial on a huge dataset may be so costly that an optimal hyperparameter configuration must be found in as few trials as possible. Observing that human experts draw on their expertise in a machine learning model by trying configurations that once performed well on other datasets, we are inspired to speed up HPO by transferring knowledge from historical HPO trials on other datasets. We propose an end-to-end, efficient HPO algorithm named Transfer Neural Processes (TNP), which achieves transfer learning by incorporating trials on other datasets, initializing the model with well-generalized parameters, and learning an initial set of hyperparameters to evaluate. Experiments on extensive OpenML datasets and three computer vision datasets show that the proposed model can achieve state-of-the-art performance with at least one order of magnitude fewer trials.
Tasks Hyperparameter Optimization, Transfer Learning
Published 2019-09-07
URL https://arxiv.org/abs/1909.03209v2
PDF https://arxiv.org/pdf/1909.03209v2.pdf
PWC https://paperswithcode.com/paper/transferable-neural-processes-for
Repo
Framework

A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks

Title A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks
Authors Lior Deutsch, Erik Nijkamp, Yu Yang
Abstract Recent work on mode connectivity in the loss landscape of deep neural networks has demonstrated that the locus of (sub-)optimal weight vectors lies on continuous paths. In this work, we train a neural network that serves as a hypernetwork, mapping a latent vector into high-performance (low-loss) weight vectors, generalizing recent findings of mode connectivity to higher dimensional manifolds. We formulate the training objective as a compromise between accuracy and diversity, where the diversity takes into account trivial symmetry transformations of the target network. We demonstrate how to reduce the number of parameters in the hypernetwork by parameter sharing. Once learned, the hypernetwork allows for a computationally efficient, ancestral sampling of neural network weights, which we recruit to form large ensembles. The improvement in classification accuracy obtained by this ensembling indicates that the generated manifold extends in dimensions other than directions implied by trivial symmetries. For computational efficiency, we distill an ensemble into a single classifier while retaining generalization.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.02898v1
PDF https://arxiv.org/pdf/1905.02898v1.pdf
PWC https://paperswithcode.com/paper/a-generative-model-for-sampling-high
Repo
Framework
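
The sampling side is easy to sketch in PyTorch: a small hypernetwork maps a latent vector to the full weight vector of a one-layer target classifier, and ensembles are formed by ancestral sampling of latents. All sizes below are arbitrary, and the accuracy-plus-diversity training objective from the paper is omitted:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
IN, HID, OUT, Z = 4, 8, 3, 16     # illustrative sizes, not the paper's

# Hypernetwork: latent z -> flat weight vector of the target classifier.
n_weights = OUT * IN + OUT
hyper = torch.nn.Sequential(
    torch.nn.Linear(Z, HID), torch.nn.ReLU(), torch.nn.Linear(HID, n_weights)
)

def target_forward(x, flat_w):
    """Run the target classifier with generated (not stored) weights."""
    W = flat_w[: OUT * IN].view(OUT, IN)
    b = flat_w[OUT * IN:]
    return F.linear(x, W, b)

x = torch.randn(5, IN)            # dummy input batch
# Ancestral sampling: one generated weight vector per ensemble member.
logits = torch.stack(
    [target_forward(x, hyper(torch.randn(Z))) for _ in range(10)]
)
ensemble_probs = logits.softmax(-1).mean(0)   # average ensemble prediction
print(ensemble_probs.shape)                   # torch.Size([5, 3])
```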

Enabling hyperparameter optimization in sequential autoencoders for spiking neural data

Title Enabling hyperparameter optimization in sequential autoencoders for spiking neural data
Authors Mohammad Reza Keshtkaran, Chethan Pandarinath
Abstract Continuing advances in neural interfaces have enabled simultaneous monitoring of spiking activity from hundreds to thousands of neurons. To interpret these large-scale data, several methods have been proposed to infer latent dynamic structure from high-dimensional datasets. One recent line of work uses recurrent neural networks in a sequential autoencoder (SAE) framework to uncover dynamics. SAEs are an appealing option for modeling nonlinear dynamical systems, and enable a precise link between neural activity and behavior on a single-trial basis. However, the very large parameter count and complexity of SAEs relative to other models have caused concern that SAEs may only perform well on very large training sets. We hypothesized that with a method to systematically optimize hyperparameters (HPs), SAEs might perform well even in cases of limited training data. Such a breakthrough would greatly extend their applicability. However, we find that SAEs applied to spiking neural data are prone to a particular form of overfitting that cannot be detected using standard validation metrics, which prevents standard HP searches. We develop and test two potential solutions: an alternate validation method (“sample validation”) and a novel regularization method (“coordinated dropout”). These innovations prevent overfitting quite effectively, and allow us to test whether SAEs can achieve good performance on limited data through large-scale HP optimization. When applied to data from motor cortex recorded while monkeys made reaches in various directions, large-scale HP optimization allowed SAEs to better maintain performance for small dataset sizes. Our results should greatly extend the applicability of SAEs in extracting latent dynamics from sparse, multidimensional data, such as neural population spiking activity.
Tasks Hyperparameter Optimization
Published 2019-08-21
URL https://arxiv.org/abs/1908.07896v2
PDF https://arxiv.org/pdf/1908.07896v2.pdf
PWC https://paperswithcode.com/paper/190807896
Repo
Framework
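
The "coordinated dropout" regularizer can be read as a masked reconstruction loss: entries dropped at the input are the only entries scored at the output, so the network cannot lower its loss by passing inputs straight through. The sketch below encodes that reading with an MLP standing in for the actual RNN-based sequential autoencoder; the masking and rescaling details are a plausible reconstruction, not the paper's exact recipe:

```python
import torch

torch.manual_seed(0)

def coordinated_dropout_loss(model, x, keep_prob=0.7):
    """Score the reconstruction only on entries held out at the input."""
    mask = (torch.rand_like(x) < keep_prob).float()
    x_in = x * mask / keep_prob              # kept entries, rescaled
    recon = model(x_in)
    held_out = 1.0 - mask                    # gradients flow only from these
    return ((recon - x) ** 2 * held_out).sum() / held_out.sum().clamp(min=1.0)

# Stand-in model on flattened spike counts (the real SAE is recurrent).
model = torch.nn.Sequential(torch.nn.Linear(50, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 50))
x = torch.poisson(torch.full((8, 50), 2.0))  # fake spike-count data
loss = coordinated_dropout_loss(model, x)
loss.backward()
print(float(loss))
```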

Exploring the Robustness of NMT Systems to Nonsensical Inputs

Title Exploring the Robustness of NMT Systems to Nonsensical Inputs
Authors Akshay Chaturvedi, Abijith KP, Utpal Garain
Abstract Neural machine translation (NMT) systems have been shown to produce undesirable translations when a small change is made in the source sentence. In this paper, we study the behaviour of NMT systems when multiple changes are made to the source sentence. In particular, we ask the following question: “Is it possible for an NMT system to predict the same translation even when multiple words in the source sentence have been replaced?” To this end, we propose a soft-attention based technique to make the aforementioned word replacements. The experiments are conducted on two language pairs, English-German (en-de) and English-French (en-fr), and two state-of-the-art NMT systems: a BLSTM-based encoder-decoder with attention and the Transformer. The proposed soft-attention based technique achieves a high success rate and outperforms existing methods like HotFlip by a significant margin on all the conducted experiments. The results demonstrate that state-of-the-art NMT systems are unable to capture the semantics of the source language. The proposed soft-attention based technique is an invariance-based adversarial attack on NMT systems. To better evaluate such attacks, we propose an alternate metric and argue for its benefits in comparison with success rate.
Tasks Adversarial Attack, Machine Translation
Published 2019-08-03
URL https://arxiv.org/abs/1908.01165v3
PDF https://arxiv.org/pdf/1908.01165v3.pdf
PWC https://paperswithcode.com/paper/invariance-based-adversarial-attack-on-neural
Repo
Framework
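
The evaluation protocol, replace several source words and check whether the translation stays the same, can be outlined in a few lines. Both translate() and the replacement policy below are hypothetical stubs: the paper selects replacements via soft attention and queries real BLSTM/Transformer NMT systems:

```python
def translate(tokens):
    """Stub NMT system; a real attack would query a trained model."""
    return ["T_" + t for t in tokens if t != "UNK"]

def attack(tokens, positions):
    """Replace the chosen source positions. The paper picks replacement
    words via soft-attention scores, not the fixed token used here."""
    return [("UNK" if i in positions else t) for i, t in enumerate(tokens)]

src = "the quick brown fox jumps".split()
reference = translate(src)

# The attack succeeds if the translation is unchanged even though
# multiple source words were replaced.
perturbed = attack(src, positions={1, 3})
print("attack succeeded:", translate(perturbed) == reference)
```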

Non-Parametric Priors For Generative Adversarial Networks

Title Non-Parametric Priors For Generative Adversarial Networks
Authors Rajhans Singh, Pavan Turaga, Suren Jayasuriya, Ravi Garg, Martin W. Braun
Abstract The advent of generative adversarial networks (GAN) has enabled new capabilities in synthesis, interpolation, and data augmentation heretofore considered very challenging. However, one common assumption in most GAN architectures is that of a simple parametric latent-space distribution. While easy to implement, a simple latent-space distribution can be problematic for uses such as interpolation, due to distributional mismatches when samples are interpolated in the latent space. We present a straightforward formalization of this problem; using basic results from probability theory and off-the-shelf optimization tools, we develop ways to arrive at appropriate non-parametric priors. The obtained prior exhibits unusual qualitative properties in terms of its shape, and quantitative benefits in terms of lower divergence with its mid-point distribution. We demonstrate that our designed prior helps improve image generation along any Euclidean straight line during interpolation, both qualitatively and quantitatively, without any additional training or architectural modifications. The proposed formulation is quite flexible, paving the way to imposing newer constraints on the latent-space statistics.
Tasks Data Augmentation, Image Generation
Published 2019-05-16
URL https://arxiv.org/abs/1905.07061v1
PDF https://arxiv.org/pdf/1905.07061v1.pdf
PWC https://paperswithcode.com/paper/non-parametric-priors-for-generative
Repo
Framework
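
The distributional mismatch that motivates the paper is easy to verify numerically: in a high-dimensional standard Gaussian latent space, samples concentrate on a shell of radius about sqrt(d), while straight-line midpoints concentrate on a smaller shell of radius about sqrt(d/2) that the generator rarely sees during training. A quick check:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 10_000

z1 = rng.standard_normal((n, d))
z2 = rng.standard_normal((n, d))
mid = 0.5 * (z1 + z2)          # straight-line midpoints in latent space

# Endpoints live near radius sqrt(512) ~ 22.6; midpoints near sqrt(256) = 16.
print(np.linalg.norm(z1, axis=1).mean())
print(np.linalg.norm(mid, axis=1).mean())
```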

A Deep Learning Based Attack for The Chaos-based Image Encryption

Title A Deep Learning Based Attack for The Chaos-based Image Encryption
Authors Chen He, Kan Ming, Yongwei Wang, Z. Jane Wang
Abstract In this letter, as a proof of concept, we propose a deep learning-based approach to attack the chaos-based image encryption algorithm of Guan et al. (2005). The proposed method first projects the chaos-based encrypted images into a low-dimensional feature space, where essential information of the plain images has been largely preserved. From the low-dimensional features, a deconvolutional generator is utilized to regenerate perceptually similar decrypted images that approximate the plain images in the high-dimensional space. Compared with conventional image encryption attack algorithms, the proposed method does not require manually analyzing and inferring keys in a time-consuming way. Instead, we directly attack the chaos-based encryption algorithm in a key-independent manner. Moreover, the proposed method can be trained end-to-end. Given the chaos-based encrypted images, a well-trained decryption model is able to automatically reconstruct plain images with high fidelity. In the experiments, we successfully attack the chaos-based algorithm, and the decrypted images are visually similar to their ground-truth plain images. Experimental results on both static-key and dynamic-key scenarios verify the efficacy of the proposed method.
Tasks
Published 2019-07-29
URL https://arxiv.org/abs/1907.12245v1
PDF https://arxiv.org/pdf/1907.12245v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-based-attack-for-the-chaos
Repo
Framework
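
The described architecture, a convolutional encoder into a low-dimensional feature space followed by a deconvolutional generator, can be sketched in PyTorch. The 64x64 grayscale input and all layer sizes are assumptions for illustration; neither the paper's exact architecture nor the attacked cipher is reproduced here:

```python
import torch
import torch.nn as nn

class DecryptionNet(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # Project encrypted images into a low-dimensional feature space.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # 32x32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16x16
            nn.Flatten(), nn.Linear(32 * 16 * 16, feat_dim),
        )
        # Deconvolutional generator approximating the plain image.
        self.generator = nn.Sequential(
            nn.Linear(feat_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, cipher):
        return self.generator(self.encoder(cipher))

# Trained end-to-end on (cipher, plain) pairs, e.g. with an L2 loss.
net = DecryptionNet()
print(net(torch.rand(2, 1, 64, 64)).shape)  # torch.Size([2, 1, 64, 64])
```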

Towards Assessing the Impact of Bayesian Optimization’s Own Hyperparameters

Title Towards Assessing the Impact of Bayesian Optimization’s Own Hyperparameters
Authors Marius Lindauer, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Frank Hutter
Abstract Bayesian Optimization (BO) is a common approach for hyperparameter optimization (HPO) in automated machine learning. Although it is well accepted that HPO is crucial to obtaining well-performing machine learning models, tuning BO’s own hyperparameters is often neglected. In this paper, we empirically study the impact of optimizing BO’s own hyperparameters and the transferability of the found settings using a wide range of benchmarks, including artificial functions, HPO, and HPO combined with neural architecture search. In particular, we show (i) that tuning can improve the any-time performance of different BO approaches, (ii) that optimized BO settings also perform well on similar problems, (iii) that they partially transfer even to problems from other problem families, and (iv) which BO hyperparameters are most important.
Tasks Hyperparameter Optimization, Neural Architecture Search
Published 2019-08-19
URL https://arxiv.org/abs/1908.06674v1
PDF https://arxiv.org/pdf/1908.06674v1.pdf
PWC https://paperswithcode.com/paper/towards-assessing-the-impact-of-bayesian
Repo
Framework

Improving End-to-End Sequential Recommendations with Intent-aware Diversification

Title Improving End-to-End Sequential Recommendations with Intent-aware Diversification
Authors Wanyu Chen, Pengjie Ren, Fei Cai, Maarten de Rijke
Abstract Sequential Recommenders (SRs), which capture users’ dynamic intents by modeling sequential user behavior, can accurately recommend products to users. Previous work on SRs has mostly focused on optimizing recommendation accuracy, often ignoring recommendation diversity, even though diversity is an important criterion for evaluating recommendation performance. Most existing methods for improving the diversity of recommendations are not well suited to SRs because they assume that user intents are static and rely on post-processing the list of recommendations to promote diversity. We consider both recommendation accuracy and diversity for SRs by proposing an end-to-end neural model, called Intent-aware Diversified Sequential Recommendation (IDSR). Specifically, we introduce an Implicit Intent Mining (IIM) module into SRs to capture the different user intents reflected in user behavior sequences. We then design an Intent-aware Diversity Promoting (IDP) loss to supervise the learning of the IIM module and force the model to take recommendation diversity into consideration during training. Extensive experiments on two benchmark datasets show that IDSR significantly outperforms state-of-the-art methods in terms of recommendation diversity while yielding comparable or superior recommendation accuracy.
Tasks
Published 2019-08-27
URL https://arxiv.org/abs/1908.10171v1
PDF https://arxiv.org/pdf/1908.10171v1.pdf
PWC https://paperswithcode.com/paper/improving-end-to-end-sequential
Repo
Framework

Spherical sampling methods for the calculation of metamer mismatch volumes

Title Spherical sampling methods for the calculation of metamer mismatch volumes
Authors Michal Mackiewicz, Hans Jakob Rivertz, Graham D. Finlayson
Abstract In this paper, we propose two methods of calculating theoretically maximal metamer mismatch volumes. Unlike prior art techniques, our methods do not make any assumptions on the shape of spectra on the boundary of the mismatch volumes. Both methods utilize a spherical sampling approach, but they calculate mismatch volumes in two different ways. The first method uses a linear programming optimization, while the second is a computational geometry approach based on half-space intersection. We show that under certain conditions the theoretically maximal metamer mismatch volume is significantly larger than the one approximated using a prior art method.
Tasks
Published 2019-01-23
URL http://arxiv.org/abs/1901.08419v1
PDF http://arxiv.org/pdf/1901.08419v1.pdf
PWC https://paperswithcode.com/paper/spherical-sampling-methods-for-the
Repo
Framework
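
The first method, spherical sampling plus linear programming, admits a direct sketch: for each sampled direction on the sphere, solve an LP for the metamer whose response under the second observer extends farthest along that direction. The random sensitivities below are stand-ins for real observer and illuminant data:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 31                                  # spectral samples (illustrative)

Phi = np.abs(rng.normal(size=(3, n)))   # first observer's sensitivities
Psi = np.abs(rng.normal(size=(3, n)))   # second observer's sensitivities

x0 = rng.uniform(0, 1, n)               # some reflectance spectrum
c = Phi @ x0                            # its color for the first observer

# For each direction u on the sphere, find the metamer (Phi x = c,
# 0 <= x <= 1) whose second-observer response goes farthest along u.
boundary = []
for _ in range(200):
    u = rng.standard_normal(3)
    u /= np.linalg.norm(u)
    res = linprog(-(u @ Psi), A_eq=Phi, b_eq=c,
                  bounds=[(0, 1)] * n, method="highs")
    if res.success:
        boundary.append(Psi @ res.x)

pts = np.array(boundary)
print("mismatch volume extent per channel:", np.ptp(pts, axis=0))
```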

Speech-to-speech Translation between Untranscribed Unknown Languages

Title Speech-to-speech Translation between Untranscribed Unknown Languages
Authors Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract In this paper, we explore a method for training speech-to-speech translation models without any transcription or linguistic supervision. Our proposed method consists of two steps: first, we generate a discrete representation via unsupervised term discovery with a discrete quantized autoencoder; second, we train a sequence-to-sequence model that directly maps the source language speech to the target language’s discrete representation. Our proposed method can directly generate target speech without any auxiliary or pre-training steps involving a source or target transcription. To the best of our knowledge, this is the first work to perform pure speech-to-speech translation between untranscribed unknown languages.
Tasks
Published 2019-10-02
URL https://arxiv.org/abs/1910.00795v2
PDF https://arxiv.org/pdf/1910.00795v2.pdf
PWC https://paperswithcode.com/paper/speech-to-speech-translation-between
Repo
Framework