January 29, 2020


Paper Group ANR 573


A Method for Evaluating Chimeric Synchronization of Coupled Oscillators and Its Application for Creating a Neural Network Information Converter


Title	A Method for Evaluating Chimeric Synchronization of Coupled Oscillators and Its Application for Creating a Neural Network Information Converter
Authors	Andrei Velichko
Abstract	This paper presents a new method for evaluating the synchronization of quasi-periodic oscillations of two oscillators, termed “chimeric synchronization”. A family of metrics is proposed to create a neural network information converter based on a network of pulsed oscillators. In addition to transforming input information from digital to analogue, the converter can perform information processing after training the network by selecting control parameters. In the proposed neural network scheme, the data arrives at the input layer in the form of current levels of the oscillators and is converted into a set of non-repeating states of the chimeric synchronization of the output oscillator. By modelling a thermally coupled VO2-oscillator circuit, the network setup is demonstrated through the selection of coupling strength, power supply levels, and the synchronization efficiency parameter. The distribution of solutions depending on the operating mode of the oscillators (sub-threshold mode or generation mode) is revealed. Technological approaches for the implementation of a neural network information converter are proposed, and examples of its application for image filtering are demonstrated. The proposed method helps to significantly expand the capabilities of neuromorphic and logical devices based on synchronization effects.
Tasks
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02680v2
PDF	https://arxiv.org/pdf/1906.02680v2.pdf
PWC	https://paperswithcode.com/paper/a-method-for-the-classification-of-chimera
Repo
Framework

Machine Learning for Clinical Predictive Analytics


Title	Machine Learning for Clinical Predictive Analytics
Authors	Wei-Hung Weng
Abstract	In this chapter, we provide a brief overview of applying machine learning techniques for clinical prediction tasks. We begin with a quick introduction to the concepts of machine learning and outline some of the most common machine learning algorithms. Next, we demonstrate how to apply the algorithms with appropriate toolkits to conduct machine learning experiments for clinical prediction tasks. The objectives of this chapter are to (1) understand the basics of machine learning techniques and the reasons behind why they are useful for solving clinical prediction problems, (2) understand the intuition behind some machine learning models, including regression, decision trees, and support vector machines, and (3) understand how to apply these models to clinical prediction problems using publicly available datasets via case studies.
Tasks
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09246v1
PDF	https://arxiv.org/pdf/1909.09246v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-for-clinical-predictive
Repo
Framework

Learning performance in inverse Ising problems with sparse teacher couplings


Title	Learning performance in inverse Ising problems with sparse teacher couplings
Authors	Alia Abbara, Yoshiyuki Kabashima, Tomoyuki Obuchi, Yingying Xu
Abstract	We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In the teacher-student scenario under the assumption that the teacher’s couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Our formulation is also applicable to a certain class of cost functions having locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that the perfect inference of the presence/absence of the teacher’s couplings is possible in the thermodynamic limit taking the number of spins $N$ as infinity while keeping the dataset size $M$ proportional to $N$, as long as $\alpha=M/N > 2$. Meanwhile, the formulas also show that the estimated coupling values corresponding to the truly existing ones in the teacher tend to be overestimated in the absolute value, manifesting the presence of estimation bias. These results are considered to be exact in the thermodynamic limit on locally tree-like networks, such as the regular random or Erdős–Rényi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed.
Tasks
Published	2019-12-25
URL	https://arxiv.org/abs/1912.11591v1
PDF	https://arxiv.org/pdf/1912.11591v1.pdf
PWC	https://paperswithcode.com/paper/learning-performance-in-inverse-ising
Repo
Framework
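
The abstract above centers on pseudolikelihood maximization, which fits each spin's conditional distribution given all the other spins. The following is a minimal sketch of that idea for a single spin, not the paper's analysis: the toy samples, the learning rate, the step count, and the function names are all illustrative assumptions, and no sparsity handling is included.

```python
import math

def neg_log_pseudolikelihood(samples, i, couplings):
    """Average of -log P(s_i | other spins) over the samples.

    couplings[j] is the trial coupling J_ij; couplings[i] is ignored.
    Uses P(s_i | rest) = exp(s_i * h) / (2 * cosh(h)), h = sum_j J_ij * s_j.
    """
    total = 0.0
    for s in samples:
        h = sum(couplings[j] * s[j] for j in range(len(s)) if j != i)
        total += math.log(2.0 * math.cosh(h)) - s[i] * h
    return total / len(samples)

def fit_spin(samples, i, lr=0.5, steps=200):
    """Plain gradient descent on the objective above (hypothetical settings)."""
    n = len(samples[0])
    J = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for s in samples:
            h = sum(J[j] * s[j] for j in range(n) if j != i)
            for j in range(n):
                if j != i:
                    # d/dJ_ij of [log(2 cosh h) - s_i h] = (tanh h - s_i) * s_j
                    grad[j] += (math.tanh(h) - s[i]) * s[j] / len(samples)
        J = [J[j] - lr * grad[j] for j in range(n)]
    return J

# Toy data (assumption): spin 0 always equals spin 1; spin 2 is uncorrelated.
samples = [(1, 1, 1), (-1, -1, 1), (1, 1, -1), (-1, -1, -1)]
J0 = fit_spin(samples, i=0)
```

On this toy data the fitted coupling between spins 0 and 1 comes out positive while the coupling to spin 2 stays at zero, which is the presence/absence distinction the abstract refers to, here in a trivially small setting.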

Meta-Learning Mean Functions for Gaussian Processes


Title	Meta-Learning Mean Functions for Gaussian Processes
Authors	Vincent Fortuin, Heiko Strathmann, Gunnar Rätsch
Abstract	When fitting Bayesian machine learning models on scarce data, the main challenge is to obtain suitable prior knowledge and encode it into the model. Recent advances in meta-learning offer powerful methods for extracting such prior knowledge from data acquired in related tasks. When it comes to meta-learning in Gaussian process models, approaches in this setting have mostly focused on learning the kernel function of the prior, but not on learning its mean function. In this work, we explore meta-learning the mean function of a Gaussian process prior. We present analytical and empirical evidence that mean function learning can be useful in the meta-learning setting, discuss the risk of overfitting, and draw connections to other meta-learning approaches, such as model agnostic meta-learning and functional PCA.
Tasks	Gaussian Processes, Meta-Learning
Published	2019-01-23
URL	https://arxiv.org/abs/1901.08098v4
PDF	https://arxiv.org/pdf/1901.08098v4.pdf
PWC	https://paperswithcode.com/paper/deep-mean-functions-for-meta-learning-in
Repo
Framework

Multilingual Culture-Independent Word Analogy Datasets


Title	Multilingual Culture-Independent Word Analogy Datasets
Authors	Matej Ulčar, Kristiina Vaik, Jessica Lindström, Milda Dailidėnaitė, Marko Robnik-Šikonja
Abstract	In text processing, deep neural networks mostly use word embeddings as an input. Embeddings have to ensure that relations between words are reflected through distances in a high-dimensional numeric space. To compare the quality of different text embeddings, typically, we use benchmark datasets. We present a collection of such datasets for the word analogy task in nine languages: Croatian, English, Estonian, Finnish, Latvian, Lithuanian, Russian, Slovenian, and Swedish. We redesigned the original monolingual analogy task to be much more culturally independent and also constructed cross-lingual analogy datasets for the involved languages. We present basic statistics of the created datasets and their initial evaluation using fastText embeddings.
Tasks	Word Embeddings
Published	2019-11-22
URL	https://arxiv.org/abs/1911.10038v2
PDF	https://arxiv.org/pdf/1911.10038v2.pdf
PWC	https://paperswithcode.com/paper/multilingual-culture-independent-word-analogy
Repo
Framework
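
To make the word analogy task concrete, here is a small sketch of how one analogy question ("a is to b as c is to ?") is commonly scored with vector offsets and cosine similarity. The three-dimensional toy vectors and the function names are made up for illustration; they are not fastText embeddings, and the abstract does not state which scoring rule the authors used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Return the word d maximizing cos(emb[d], emb[b] - emb[a] + emb[c]).

    The three query words are excluded from the candidates.
    """
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Hypothetical toy embedding: dimensions are (royalty, male, female).
toy = {
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "apple": [0.1, 0.1, 0.1],
}

answer = solve_analogy(toy, "man", "king", "woman")  # "queen" for these vectors
```

A benchmark score is then the fraction of questions in a dataset for which the returned word matches the expected one.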

ReachNN: Reachability Analysis of Neural-Network Controlled Systems


Title	ReachNN: Reachability Analysis of Neural-Network Controlled Systems
Authors	Chao Huang, Jiameng Fan, Wenchao Li, Xin Chen, Qi Zhu
Abstract	Applying neural networks as controllers in dynamical systems has shown great promise. However, it is critical yet challenging to verify the safety of such control systems with neural-network controllers in the loop. Previous methods for verifying neural network controlled systems are limited to a few specific activation functions. In this work, we propose a new reachability analysis approach based on Bernstein polynomials that can verify neural-network controlled systems with a more general form of activation functions, i.e., as long as they ensure that the neural networks are Lipschitz continuous. Specifically, we consider abstracting feedforward neural networks with Bernstein polynomials for a small subset of inputs. To quantify the error introduced by abstraction, we provide both theoretical error bound estimation based on the theory of Bernstein polynomials and more practical sampling based error bound estimation, following a tight Lipschitz constant estimation approach based on forward reachability analysis. Compared with previous methods, our approach addresses a much broader set of neural networks, including heterogeneous neural networks that contain multiple types of activation functions. Experiment results on a variety of benchmarks show the effectiveness of our approach.
Tasks
Published	2019-06-25
URL	https://arxiv.org/abs/1906.10654v1
PDF	https://arxiv.org/pdf/1906.10654v1.pdf
PWC	https://paperswithcode.com/paper/reachnn-reachability-analysis-of-neural
Repo
Framework
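
The core idea in the abstract above, replacing a Lipschitz-continuous network by a Bernstein polynomial and bounding the resulting error, can be illustrated in one dimension. This is only a sketch under stated assumptions: the single tanh "neuron", the degree, and the sample count are made up, and the bound used is the classical L / (2 * sqrt(n)) estimate for Lipschitz functions on [0, 1], which is not necessarily the bound derived in the paper.

```python
import math

def bernstein(f, n):
    """Degree-n Bernstein polynomial of f on [0, 1], returned as a callable."""
    values = [f(k / n) for k in range(n + 1)]

    def poly(x):
        return sum(
            values[k] * math.comb(n, k) * x**k * (1.0 - x) ** (n - k)
            for k in range(n + 1)
        )

    return poly

def sampled_error(f, p, samples=1001):
    """Largest |f - p| over an evenly spaced grid on [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - p(x)) for x in grid)

# Hypothetical one-neuron "network" with Lipschitz constant 3 on [0, 1].
def neuron(x):
    return math.tanh(3.0 * x - 1.5)

LIPSCHITZ = 3.0
DEGREE = 20

approx = bernstein(neuron, DEGREE)
err = sampled_error(neuron, approx)                # sampling-based estimate
bound = LIPSCHITZ / (2.0 * math.sqrt(DEGREE))      # classical worst-case bound
```

The sampled error is typically far below the worst-case bound, which is in the spirit of the abstract's distinction between a theoretical bound and a more practical sampling-based one.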

AAAI-2019 Workshop on Games and Simulations for Artificial Intelligence


Title	AAAI-2019 Workshop on Games and Simulations for Artificial Intelligence
Authors	Marwan Mattar, Roozbeh Mottaghi, Julian Togelius, Danny Lange
Abstract	This volume represents the accepted submissions from the AAAI-2019 Workshop on Games and Simulations for Artificial Intelligence held on January 29, 2019 in Honolulu, Hawaii, USA. https://www.gamesim.ai
Tasks
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02172v1
PDF	http://arxiv.org/pdf/1903.02172v1.pdf
PWC	https://paperswithcode.com/paper/aaai-2019-workshop-on-games-and-simulations
Repo
Framework

Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval


Title	Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
Authors	Yan Shuo Tan, Roman Vershynin
Abstract	In recent literature, a general two-step procedure has been formulated for solving the problem of phase retrieval. First, a spectral technique is used to obtain a constant-error initial estimate, following which the estimate is refined to arbitrary precision by first-order optimization of a non-convex loss function. Numerical experiments, however, seem to suggest that simply running the iterative schemes from a random initialization may also lead to convergence, albeit at the cost of slightly higher sample complexity. In this paper, we prove that, in fact, constant step size online stochastic gradient descent (SGD) converges from arbitrary initializations for the non-smooth, non-convex amplitude squared loss objective. In this setting, online SGD is also equivalent to the randomized Kaczmarz algorithm from numerical analysis. Our analysis can easily be generalized to other single index models. It also makes use of new ideas from stochastic process theory, including the notion of a summary state space, which we believe will be of use for the broader field of non-convex optimization.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12837v1
PDF	https://arxiv.org/pdf/1910.12837v1.pdf
PWC	https://paperswithcode.com/paper/online-stochastic-gradient-descent-with
Repo
Framework
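
The equivalence mentioned in the abstract can be sketched as follows: a randomized Kaczmarz update for the measurement b = |<a, x_true>| is an SGD step on the loss (|<a, x>| - b)^2 / 2 with step size 1 / ||a||^2. The normalization, the Gaussian measurement vectors, the dimension, and the step count below are assumptions made for illustration, not settings taken from the paper.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kaczmarz_step(x, a, b):
    """One update from a single amplitude measurement b = |<a, x_true>|.

    Projects x onto the hyperplane <a, x> = b * sign(<a, x>), i.e. an SGD
    step on 0.5 * (|<a, x>| - b)^2 with step size 1 / ||a||^2.
    """
    inner = dot(a, x)
    sign = 1.0 if inner >= 0 else -1.0
    scale = (inner - b * sign) / dot(a, a)
    return [xi - scale * ai for xi, ai in zip(x, a)]

def run(dim=5, steps=5000, seed=0):
    """Stream fresh Gaussian measurements, starting from a random point."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # arbitrary initialization
    for _ in range(steps):
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x = kaczmarz_step(x, a, abs(dot(a, x_true)))
    # The signal is only identifiable up to a global sign.
    plus = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, x_true)))
    minus = math.sqrt(sum((p + q) ** 2 for p, q in zip(x, x_true)))
    return min(plus, minus)

final_error = run()
```

Each step makes the current iterate exactly consistent with the latest measurement; whether the whole sequence converges from an arbitrary start is the question the paper addresses.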

Unsupervised Learning of Object Structure and Dynamics from Videos


Title	Unsupervised Learning of Object Structure and Dynamics from Videos
Authors	Matthias Minderer, Chen Sun, Ruben Villegas, Forrester Cole, Kevin Murphy, Honglak Lee
Abstract	Extracting and predicting object structure and dynamics from videos without supervision is a major challenge in machine learning. To address this challenge, we adopt a keypoint-based image representation and learn a stochastic dynamics model of the keypoints. Future frames are reconstructed from the keypoints and a reference frame. By modeling dynamics in the keypoint coordinate space, we achieve stable learning and avoid compounding of errors in pixel space. Our method improves upon unstructured representations both for pixel-level video prediction and for downstream tasks requiring object-level understanding of motion dynamics. We evaluate our model on diverse datasets: a multi-agent sports dataset, the Human3.6M dataset, and datasets based on continuous control tasks from the DeepMind Control Suite. The spatially structured representation outperforms unstructured representations on a range of motion-related tasks such as object tracking, action recognition and reward prediction.
Tasks	Continuous Control, Object Tracking, Video Prediction
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07889v3
PDF	https://arxiv.org/pdf/1906.07889v3.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-of-object-structure-and
Repo
Framework

Decision Forest: A Nonparametric Approach to Modeling Irrational Choice


Title	Decision Forest: A Nonparametric Approach to Modeling Irrational Choice
Authors	Yi-Chun Chen, Velibor V. Mišić
Abstract	Customer behavior is often assumed to follow weak rationality, which implies that adding a product to an assortment will not increase the choice probability of another product in that assortment. However, an increasing amount of research has revealed that customers are not necessarily rational when making decisions. In this paper, we study a new nonparametric choice model that relaxes this assumption and can model a wider range of customer behavior, such as decoy effects between products. In this model, each customer type is associated with a binary decision tree, which represents a decision process for making a purchase based on checking for the existence of specific products in the assortment. Together with a probability distribution over customer types, we show that the resulting model – a decision forest – is able to represent any customer choice model, including models that are inconsistent with weak rationality. We theoretically characterize the depth of the forest needed to fit a data set of historical assortments and prove that asymptotically, a forest whose depth scales logarithmically in the number of assortments is sufficient to fit most data sets. We also propose an efficient algorithm for estimating such models from data, based on combining randomization and optimization. Using synthetic data and real transaction data exhibiting non-rational behavior, we show that the model outperforms the multinomial logit and ranking-based models in out-of-sample predictive ability.
Tasks
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11532v1
PDF	http://arxiv.org/pdf/1904.11532v1.pdf
PWC	https://paperswithcode.com/paper/decision-forest-a-nonparametric-approach-to
Repo
Framework
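
The model described in the abstract, a probability distribution over customer types where each type is a binary tree that checks for products in the assortment, can be sketched in a few lines. The tree encoding, the weights, and the product ids below are hypothetical; the example is constructed so that adding a decoy product raises the choice probability of another product, the kind of behavior that violates weak rationality.

```python
# A tree is either a leaf (an int: the purchased product id, 0 = no purchase)
# or a tuple (product_to_check, subtree_if_present, subtree_if_absent).

def choose(tree, assortment):
    """Walk one customer type's tree and return the chosen option."""
    while not isinstance(tree, int):
        product, if_present, if_absent = tree
        tree = if_present if product in assortment else if_absent
    return tree

def choice_probabilities(forest, assortment):
    """Sum the weights of all customer types that pick each option."""
    probs = {}
    for weight, tree in forest:
        option = choose(tree, assortment)
        probs[option] = probs.get(option, 0.0) + weight
    return probs

# Hypothetical forest with two customer types; product 3 acts as a decoy.
forest = [
    # Type A: if 3 is offered, buy 2 when available (else 3);
    #         otherwise buy 1 when available (else nothing).
    (0.6, (3, (2, 2, 3), (1, 1, 0))),
    # Type B: buy 2 when available, else nothing.
    (0.4, (2, 2, 0)),
]

without_decoy = choice_probabilities(forest, {1, 2})
with_decoy = choice_probabilities(forest, {1, 2, 3})
```

Here product 2 is chosen with probability 0.4 from the assortment {1, 2} and with probability 1.0 once product 3 is added, which a model restricted to weak rationality could not represent.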

Stochastic algorithms with geometric step decay converge linearly on sharp functions


Title	Stochastic algorithms with geometric step decay converge linearly on sharp functions
Authors	Damek Davis, Dmitriy Drusvyatskiy, Vasileios Charisopoulos
Abstract	Stochastic (sub)gradient methods require step size schedule tuning to perform well in practice. Classical tuning strategies decay the step size polynomially and lead to optimal sublinear rates on (strongly) convex problems. An alternative schedule, popular in nonconvex optimization, is called geometric step decay and proceeds by halving the step size after every few epochs. In recent work, geometric step decay was shown to improve exponentially upon classical sublinear rates for the class of sharp convex functions. In this work, we ask whether geometric step decay similarly improves stochastic algorithms for the class of sharp nonconvex problems. Such losses feature in modern statistical recovery problems and lead to a new challenge not present in the convex setting: the region of convergence is local, so one must bound the probability of escape. Our main result shows that for a large class of stochastic, sharp, nonsmooth, and nonconvex problems a geometric step decay schedule endows well-known algorithms with a local linear rate of convergence to global minimizers. This guarantee applies to the stochastic projected subgradient, proximal point, and prox-linear algorithms. As an application of our main result, we analyze two statistical recovery tasks (phase retrieval and blind deconvolution) and match the best known guarantees under Gaussian measurement models and establish new guarantees under heavy-tailed distributions.
Tasks
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09547v1
PDF	https://arxiv.org/pdf/1907.09547v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-algorithms-with-geometric-step
Repo
Framework
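
The geometric step decay schedule from the abstract can be illustrated on the simplest sharp function, f(x) = |x|. This is a toy sketch only: the one-dimensional convex objective, the uniform noise model, and all parameter values are assumptions and do not reproduce the nonconvex setting analyzed in the paper.

```python
import random

def sign(x):
    return (x > 0) - (x < 0)

def subgradient_method(x0, step0, epochs, iters_per_epoch,
                       decay=0.5, noise=0.0, seed=0):
    """(Noisy) subgradient method on f(x) = |x|.

    The step size is held constant within an epoch and multiplied by
    `decay` after each epoch; decay=1.0 gives a constant step size.
    """
    rng = random.Random(seed)
    x, step = x0, step0
    for _ in range(epochs):
        for _ in range(iters_per_epoch):
            g = sign(x) + noise * rng.uniform(-1.0, 1.0)  # noisy subgradient
            x -= step * g
        step *= decay
    return x

# Same budget of 200 iterations, with and without halving the step size.
decayed = subgradient_method(0.9, 0.5, epochs=20, iters_per_epoch=10)
constant = subgradient_method(0.9, 0.5, epochs=20, iters_per_epoch=10, decay=1.0)
noisy = subgradient_method(0.9, 0.5, epochs=20, iters_per_epoch=10, noise=0.5)
```

With a constant step the iterate keeps oscillating at a distance comparable to the step size, while halving the step after each epoch shrinks that distance geometrically with the number of epochs.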

Approximating Gaussian Process Emulators with Linear Inequality Constraints and Noisy Observations via MC and MCMC


Title	Approximating Gaussian Process Emulators with Linear Inequality Constraints and Noisy Observations via MC and MCMC
Authors	Andrés F. López-Lopera, François Bachoc, Nicolas Durrande, Jérémy Rohmer, Déborah Idier, Olivier Roustant
Abstract	Adding inequality constraints (e.g. boundedness, monotonicity, convexity) into Gaussian processes (GPs) can lead to more realistic stochastic emulators. Due to the truncated Gaussianity of the posterior, its distribution has to be approximated. In this work, we consider Monte Carlo (MC) and Markov Chain Monte Carlo (MCMC) methods. However, strictly interpolating the observations may entail expensive computations due to highly restrictive sample spaces. Furthermore, having (constrained) GP emulators when data are actually noisy is also of interest for real-world implementations. Hence, we introduce a noise term for the relaxation of the interpolation conditions, and we develop the corresponding approximation of GP emulators under linear inequality constraints. We show with various toy examples that the performance of MC and MCMC samplers improves when considering noisy observations. Finally, on 2D and 5D coastal flooding applications, we show that more flexible and realistic GP implementations can be obtained by considering noise effects and by enforcing the (linear) inequality constraints.
Tasks	Gaussian Processes
Published	2019-01-15
URL	https://arxiv.org/abs/1901.04827v2
PDF	https://arxiv.org/pdf/1901.04827v2.pdf
PWC	https://paperswithcode.com/paper/approximating-gaussian-process-emulators-with
Repo
Framework

Multi-Granularity Fusion Network for Proposal and Activity Localization: Submission to ActivityNet Challenge 2019 Task 1 and Task 2


Title	Multi-Granularity Fusion Network for Proposal and Activity Localization: Submission to ActivityNet Challenge 2019 Task 1 and Task 2
Authors	Haisheng Su, Xu Zhao, Shuming Liu
Abstract	This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2019 Task 1 (temporal action proposal generation) and Task 2 (temporal action localization/detection). Temporal action proposal indicates the temporal intervals containing the actions and plays an important role in temporal action localization. Top-down and bottom-up methods are the two main categories used for proposal generation in the existing literature. In this paper, we devise a novel Multi-Granularity Fusion Network (MGFN) to combine the proposals generated from different frameworks for complementary filtering and confidence re-ranking. Specifically, we consider the diversity comprehensively from multiple perspectives, e.g. the characteristic aspect, the data aspect, the model aspect and the result aspect. Our MGFN achieves the state-of-the-art performance on the temporal action proposal task with 69.85 AUC score and the temporal action localization task with 38.90 mAP on the challenge testing set.
Tasks	Action Localization, Temporal Action Localization, Temporal Action Proposal Generation
Published	2019-07-29
URL	https://arxiv.org/abs/1907.12223v1
PDF	https://arxiv.org/pdf/1907.12223v1.pdf
PWC	https://paperswithcode.com/paper/multi-granularity-fusion-network-for-proposal
Repo
Framework

Investigation on Combining 3D Convolution of Image Data and Optical Flow to Generate Temporal Action Proposals


Title	Investigation on Combining 3D Convolution of Image Data and Optical Flow to Generate Temporal Action Proposals
Authors	Patrick Schlosser, David Münch, Michael Arens
Abstract	In this paper, several variants of two-stream architectures for temporal action proposal generation in long, untrimmed videos are presented. Inspired by the recent advances in the field of human action recognition utilizing 3D convolutions in combination with two-stream networks and based on the Single-Stream Temporal Action Proposals (SST) architecture, four different two-stream architectures utilizing sequences of images on one stream and sequences of images of optical flow on the other stream are investigated. The four architectures fuse the two separate streams at different depths in the model; for each of them, a broad range of parameters is investigated systematically and an optimal parametrization is determined empirically. The experiments on the THUMOS’14 dataset show that all four two-stream architectures are able to outperform the original single-stream SST and achieve state-of-the-art results. Additional experiments, in which the previously used Brox method was replaced with FlowNet2, show that the improvements are not restricted to a single method of calculating optical flow.
Tasks	Optical Flow Estimation, Temporal Action Localization, Temporal Action Proposal Generation
Published	2019-03-11
URL	http://arxiv.org/abs/1903.04176v2
PDF	http://arxiv.org/pdf/1903.04176v2.pdf
PWC	https://paperswithcode.com/paper/investigation-on-combining-3d-convolution-of
Repo
Framework

Speaker Adaptation for Attention-Based End-to-End Speech Recognition


Title	Speaker Adaptation for Attention-Based End-to-End Speech Recognition
Authors	Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong
Abstract	We propose three regularization-based speaker adaptation approaches to adapt the attention-based encoder-decoder (AED) model with very limited adaptation data from target speakers for end-to-end automatic speech recognition. The first method is Kullback-Leibler divergence (KLD) regularization, in which the output distribution of a speaker-dependent (SD) AED is forced to be close to that of the speaker-independent (SI) model by adding a KLD regularization to the adaptation criterion. To compensate for the asymmetric deficiency in KLD regularization, an adversarial speaker adaptation (ASA) method is proposed to regularize the deep-feature distribution of the SD AED through the adversarial learning of an auxiliary discriminator and the SD AED. The third approach is multi-task learning, in which an SD AED is trained to jointly perform the primary task of predicting a large number of output units and an auxiliary task of predicting a small number of output units to alleviate the target sparsity issue. Evaluated on a Microsoft short message dictation task, all three methods are highly effective in adapting the AED model, achieving up to 12.2% and 3.0% word error rate improvement over an SI AED trained on 3400 hours of data for supervised and unsupervised adaptation, respectively.
Tasks	End-To-End Speech Recognition, Multi-Task Learning, Speech Recognition
Published	2019-11-09
URL	https://arxiv.org/abs/1911.03762v1
PDF	https://arxiv.org/pdf/1911.03762v1.pdf
PWC	https://paperswithcode.com/paper/speaker-adaptation-for-attention-based-end-to
Repo
Framework
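
The first method in the abstract, KLD regularization, adds a term that keeps the adapted (SD) model's output distribution close to that of the original (SI) model. A minimal per-token sketch is shown below; the interpolation weight rho, the way the two terms are combined, and the example distributions are assumptions for illustration, since the abstract does not give the exact adaptation criterion.

```python
import math

def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kld_regularized_loss(one_hot, p_sd, p_si, rho):
    """Hypothetical criterion: (1 - rho) * CE(label, SD) + rho * KL(SI || SD)."""
    return (1.0 - rho) * cross_entropy(one_hot, p_sd) + rho * kl_divergence(p_si, p_sd)

# Made-up output distributions over three output units for one token.
label = [1.0, 0.0, 0.0]
p_si = [0.7, 0.2, 0.1]        # speaker-independent model
p_sd = [0.9, 0.05, 0.05]      # speaker-dependent model after some adaptation

loss = kld_regularized_loss(label, p_sd, p_si, rho=0.5)
```

With rho = 0 the criterion reduces to ordinary cross-entropy on the adaptation data, and larger rho values penalize drift away from the SI model more strongly.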