April 1, 2020

3319 words 16 mins read

Paper Group ANR 416

GAST-Net: Graph Attention Spatio-temporal Convolutional Networks for 3D Human Pose Estimation in Video

Title GAST-Net: Graph Attention Spatio-temporal Convolutional Networks for 3D Human Pose Estimation in Video
Authors Junfa Liu, Yisheng Guang, Juan Rojas
Abstract 3D pose estimation in video can benefit greatly from both temporal and spatial information. Occlusions and depth ambiguities remain outstanding problems. In this work, we study how to learn the kinematic constraints of the human skeleton by modeling additional spatial information through attention and interleaving it in a synergistic way with temporal models. We contribute a graph attention spatio-temporal convolutional network (GAST-Net) that makes full use of spatio-temporal information and mitigates the problems of occlusion and depth ambiguity. We also contribute attention mechanisms that learn inter-joint relations that are easily visualizable. GAST-Net comprises interleaved temporal convolutional and graph attention blocks. We use dilated temporal convolutional networks (TCNs) to model long-term patterns. More critically, graph attention blocks encode local and global representations through novel convolutional kernels that express the symmetrical structure of the human skeleton and adaptively extract global semantics over time. GAST-Net outperforms SOTA by approximately 10% in mean per-joint position error with ground-truth labels on Human3.6M and achieves competitive results on HumanEva-I.
Tasks 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published 2020-03-11
URL https://arxiv.org/abs/2003.14179v1
PDF https://arxiv.org/pdf/2003.14179v1.pdf
PWC https://paperswithcode.com/paper/gast-net-graph-attention-spatio-temporal
Repo
Framework
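
A minimal, illustrative PyTorch sketch of the two block types that GAST-Net interleaves may help fix ideas. This is not the authors' released code: the block names, tensor shapes, and the plain dot-product attention are assumptions.

```python
import torch
import torch.nn as nn

class GraphAttentionBlock(nn.Module):
    """Spatial block: every joint attends to every other joint."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Linear(channels, channels)
        self.k = nn.Linear(channels, channels)
        self.v = nn.Linear(channels, channels)

    def forward(self, x):                       # x: (batch, joints, channels)
        attn = torch.softmax(
            self.q(x) @ self.k(x).transpose(1, 2) / x.shape[-1] ** 0.5, dim=-1)
        return x + attn @ self.v(x)             # residual connection

class TemporalBlock(nn.Module):
    """Temporal block: dilated 1D convolution along the frame axis."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3,
                              dilation=dilation, padding=dilation)

    def forward(self, x):                       # x: (batch, channels, frames)
        return x + torch.relu(self.conv(x))

# A full model would alternate these blocks (reshaping between joint-major
# and time-major layouts) and regress 3D joint positions at the end.
```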

Algorithms for Non-Stationary Generalized Linear Bandits

Title Algorithms for Non-Stationary Generalized Linear Bandits
Authors Yoan Russac, Olivier Cappé, Aurélien Garivier
Abstract The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings. In the example of binary rewards, logistic regression is well-known to be preferable to the use of standard linear modeling. Previous works have shown how to deal with GLMs in contextual online learning with bandit feedback when the environment is assumed to be stationary. In this paper, we relax this latter assumption and propose two upper confidence bound based algorithms that make use of either a sliding window or a discounted maximum-likelihood estimator. We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes. These results take the form of high probability upper bounds for the dynamic regret that are of order $d^{2/3} G^{1/3} T^{2/3}$, where $d$, $T$ and $G$ are respectively the dimension of the unknown parameter, the number of rounds and the number of breakpoints up to time $T$. The empirical performance of the algorithms is illustrated in simulated environments.
Tasks
Published 2020-03-23
URL https://arxiv.org/abs/2003.10113v1
PDF https://arxiv.org/pdf/2003.10113v1.pdf
PWC https://paperswithcode.com/paper/algorithms-for-non-stationary-generalized
Repo
Framework
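
As a rough illustration of the discounted-estimator variant, the sketch below keeps geometrically discounted weights in the logistic likelihood and picks arms by an optimistic score. The confidence width `alpha * sqrt(a^T V^{-1} a)` is a generic stand-in, not the calibrated radius from the paper's analysis.

```python
import numpy as np
from scipy.optimize import minimize

def discounted_logistic_ucb(contexts, rewards, gamma, arm_set, alpha=1.0, lam=1.0):
    """One decision step given past (context, reward) pairs; needs t >= 1."""
    t = len(rewards)
    assert t > 0, "warm-start with at least one observation"
    w = gamma ** np.arange(t - 1, -1, -1)            # discount: newest weight = 1
    X = np.asarray(contexts); y = np.asarray(rewards)
    d = X.shape[1]

    def neg_loglik(theta):                            # discounted, ridge-penalized
        z = X @ theta
        return -(w * (y * z - np.logaddexp(0, z))).sum() + lam * theta @ theta / 2

    theta_hat = minimize(neg_loglik, np.zeros(d)).x
    V = (w[:, None, None] * np.einsum('ti,tj->tij', X, X)).sum(0) + lam * np.eye(d)
    V_inv = np.linalg.inv(V)
    scores = [a @ theta_hat + alpha * np.sqrt(a @ V_inv @ a) for a in arm_set]
    return int(np.argmax(scores))                     # optimistic arm choice
```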

FedLoc: Federated Learning Framework for Cooperative Localization and Location Data Processing

Title FedLoc: Federated Learning Framework for Cooperative Localization and Location Data Processing
Authors Feng Yin, Zhidi Lin, Yue Xu, Qinglei Kong, Deshi Li, Sergios Theodoridis, Shuguang Cui
Abstract In this paper, we propose a new localization framework in which mobile users or smart agents cooperate to build accurate location services without sacrificing privacy, in particular information related to their trajectories. The proposed framework is called Federated Localization (FedLoc) because it adopts the recently proposed federated learning paradigm. Apart from the new FedLoc framework, this paper can also be read as an overview, in which we review the state-of-the-art federated learning framework, two widely used learning models, various distributed model hyper-parameter optimization schemes, and practical use cases that fall under the FedLoc framework. The use cases, drawn from a mixture of standard, recently published, and unpublished works, cover a broad range of location services, including collaborative static localization/fingerprinting, indoor target tracking, outdoor navigation using low-sampling-rate GPS, and spatio-temporal wireless traffic data modeling and prediction. The primary results obtained confirm that the proposed FedLoc framework is well suited to data-driven, machine learning-based localization and spatio-temporal data modeling. Future research directions are discussed at the end of the paper.
Tasks
Published 2020-03-08
URL https://arxiv.org/abs/2003.03697v1
PDF https://arxiv.org/pdf/2003.03697v1.pdf
PWC https://paperswithcode.com/paper/fedloc-federated-learning-framework-for
Repo
Framework
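
Since FedLoc builds on standard federated learning, a single federated-averaging round conveys the core mechanics. The sketch below is generic FedAvg with placeholder client models and data loaders (e.g. fingerprint features to positions), not FedLoc's actual implementation.

```python
import copy
import torch

def federated_round(global_model, client_loaders, local_epochs=1, lr=0.01):
    """One FedAvg round: clients train locally, server averages weights.
    Assumes all state_dict entries are float tensors (no BatchNorm counters)."""
    client_states, client_sizes = [], []
    for loader in client_loaders:
        model = copy.deepcopy(global_model)          # local copy of global model
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(local_epochs):
            for x, y in loader:                      # e.g. (RSS features, position)
                opt.zero_grad()
                loss = torch.nn.functional.mse_loss(model(x), y)
                loss.backward()
                opt.step()
        client_states.append(model.state_dict())
        client_sizes.append(len(loader.dataset))

    total = sum(client_sizes)                        # size-weighted average
    avg = {k: sum(s[k] * (n / total) for s, n in zip(client_states, client_sizes))
           for k in client_states[0]}
    global_model.load_state_dict(avg)
    return global_model
```

Only model weights leave each client; the raw trajectory data never does, which is the privacy argument the paper makes.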

Learning the Hypotheses Space from data Part II: Convergence and Feasibility

Title Learning the Hypotheses Space from data Part II: Convergence and Feasibility
Authors Diego Marcondes, Adilson Simonis, Junior Barrera
Abstract In Part \textit{I} we proposed a structure for a general Hypotheses Space $\mathcal{H}$, the Learning Space $\mathbb{L}(\mathcal{H})$, which can be employed to avoid \textit{overfitting} when estimating in a complex space with a relative shortage of examples. We also presented the U-curve property, which can be exploited to select a Hypotheses Space without exhaustively searching $\mathbb{L}(\mathcal{H})$. In this paper, we carry our agenda further by showing the consistency of a model selection framework based on Learning Spaces, in which one selects from data the Hypotheses Space on which to learn. The method developed in this paper adds to the state of the art in model selection by extending Vapnik-Chervonenkis Theory to \textit{random} Hypotheses Spaces, i.e., Hypotheses Spaces learned from data. In this framework, one estimates a random subspace $\hat{\mathcal{M}} \in \mathbb{L}(\mathcal{H})$ which converges with probability one to a target Hypotheses Space $\mathcal{M}^{\star} \in \mathbb{L}(\mathcal{H})$ with desired properties. As the convergence implies asymptotically unbiased estimators, we have a consistent framework for model selection, showing that it is feasible to learn the Hypotheses Space from data. Furthermore, we show that the generalization errors of learning on $\hat{\mathcal{M}}$ are smaller than those incurred when learning on $\mathcal{H}$, so it is more efficient to learn on a subspace learned from data.
Tasks Model Selection
Published 2020-01-30
URL https://arxiv.org/abs/2001.11578v1
PDF https://arxiv.org/pdf/2001.11578v1.pdf
PWC https://paperswithcode.com/paper/learning-the-hypotheses-space-from-data-part-1
Repo
Framework
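
The U-curve idea can be illustrated on a toy chain of nested hypothesis spaces: walk up the chain and stop when the estimated error starts rising. This sketch uses polynomial regression of increasing degree as a stand-in for $\mathbb{L}(\mathcal{H})$ and is only a caricature of the paper's framework, not its estimator.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def u_curve_select(X, y, max_degree=10):
    """X: (n, 1). Nested spaces = polynomial features of increasing degree;
    stop at the first local minimum of cross-validated error (U-curve)."""
    best_deg, best_err = 0, np.inf
    for deg in range(1, max_degree + 1):
        feats = np.hstack([X ** k for k in range(1, deg + 1)])
        err = -cross_val_score(LinearRegression(), feats, y,
                               scoring='neg_mean_squared_error', cv=5).mean()
        if err > best_err:          # error started rising: minimum passed
            break
        best_deg, best_err = deg, err
    return best_deg, best_err
```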

Data-Driven Permanent Magnet Temperature Estimation in Synchronous Motors with Supervised Machine Learning

Title Data-Driven Permanent Magnet Temperature Estimation in Synchronous Motors with Supervised Machine Learning
Authors Wilhelm Kirchgässner, Oliver Wallscheid, Joachim Böcker
Abstract Monitoring the magnet temperature in permanent magnet synchronous motors (PMSMs) for automotive applications has been a challenging task for several decades, as signal-injection and sensor-based methods still prove unfeasible in a commercial context. Overheating results in severe motor deterioration and is thus of high concern for the machine’s control strategy and its design. The lack of precise temperature estimates leads to lower device utilization and higher material cost. In this work, several machine learning (ML) models are empirically evaluated on their estimation accuracy for the task of predicting latent, highly dynamic magnet temperature profiles. The selected algorithms cover approaches as diverse as possible, with ordinary and weighted least squares, support vector regression, $k$-nearest neighbors, randomized trees and neural networks. Given the available test bench data, it is shown that ML approaches relying merely on collected data meet the estimation performance of classical thermal models built on thermodynamic theory, yet not all kinds of models make efficient use of large datasets or offer sufficient modeling capacity. In particular, linear regression and simple feed-forward neural networks with optimized hyperparameters show strong predictive quality at low to moderate model sizes.
Tasks
Published 2020-01-17
URL https://arxiv.org/abs/2001.06246v1
PDF https://arxiv.org/pdf/2001.06246v1.pdf
PWC https://paperswithcode.com/paper/data-driven-permanent-magnet-temperature
Repo
Framework
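
The evaluated model families map directly onto scikit-learn estimators, so the comparison can be sketched compactly. The column names, target name, and hyperparameter values below are illustrative assumptions, not the paper's exact configuration.

```python
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import ExtraTreesRegressor     # randomized trees
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

MODELS = {
    'ols': LinearRegression(),
    'svr': SVR(C=10.0),
    'knn': KNeighborsRegressor(n_neighbors=15),
    'extra_trees': ExtraTreesRegressor(n_estimators=200),
    'mlp': MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500),
}

def evaluate(train_df, test_df, features, target='magnet_temperature'):
    """Fit each model on the train profiles and score MSE on held-out ones."""
    scores = {}
    for name, model in MODELS.items():
        model.fit(train_df[features], train_df[target])
        scores[name] = mean_squared_error(test_df[target],
                                          model.predict(test_df[features]))
    return scores
```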

Complete Endomorphisms in Computer Vision

Title Complete Endomorphisms in Computer Vision
Authors Javier Finat, Francisco Delgado-del-Hoyo
Abstract Correspondences between k-tuples of points are key in multiple view geometry and motion analysis. Regular transformations are posed by homographies between two projective planes that serve as structural models for images. Such transformations cannot include degenerate situations. Fundamental and essential matrices extend homographies with structural information by using degenerate bilinear maps. The projectivization of the endomorphisms of a three-dimensional vector space includes all of them. Hence, they are able to explain a wider range of possibly degenerate transformations between arbitrary pairs of views. To include these degenerate situations, this paper introduces a completion of the bilinear maps between spaces, given by an equivariant compactification of the regular transformations. This completion extends to the varieties of fundamental and essential matrices, where most methods based on regular transformations fail. The construction of complete endomorphisms handles degenerate projection maps using a simultaneous action on source and target spaces. In this way, the mathematical construction provides a robust framework for relating corresponding views in multiple view geometry.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09003v1
PDF https://arxiv.org/pdf/2002.09003v1.pdf
PWC https://paperswithcode.com/paper/complete-endomorphisms-in-computer-vision
Repo
Framework

Variational Autoencoders with Riemannian Brownian Motion Priors

Title Variational Autoencoders with Riemannian Brownian Motion Priors
Authors Dimitris Kalatzis, David Eklund, Georgios Arvanitidis, Søren Hauberg
Abstract Variational Autoencoders (VAEs) represent the given data in a low-dimensional latent space, which is generally assumed to be Euclidean. This assumption naturally leads to the common choice of a standard Gaussian prior over continuous latent variables. Recent work has, however, shown that this prior has a detrimental effect on model capacity, leading to subpar performance. We propose that the Euclidean assumption lies at the heart of this failure mode. To counter this, we assume a Riemannian structure over the latent space, which constitutes a more principled geometric view of the latent codes, and replace the standard Gaussian prior with a Riemannian Brownian motion prior. We propose an efficient inference scheme that does not rely on the unknown normalizing factor of this prior. Finally, we demonstrate that this prior significantly increases model capacity using only one additional scalar parameter.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05227v2
PDF https://arxiv.org/pdf/2002.05227v2.pdf
PWC https://paperswithcode.com/paper/variational-autoencoders-with-riemannian
Repo
Framework
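
The Riemannian structure referred to above is typically the metric pulled back through the decoder, $G(z) = J(z)^\top J(z)$ with $J$ the decoder Jacobian. The sketch below computes that metric and takes a naive Euler-Maruyama Brownian step under it; the drift term of manifold Brownian motion and the authors' normalizer-free inference scheme are omitted, so treat this only as intuition for the geometry.

```python
import torch

def pullback_metric(decoder, z):
    """G(z) = J(z)^T J(z); assumes decoder maps a 1-D latent z to a 1-D output."""
    J = torch.autograd.functional.jacobian(decoder, z)   # (out_dim, latent_dim)
    return J.T @ J

def latent_brownian_step(decoder, z, dt=1e-2):
    """Naive Euler-Maruyama step: noise covariance G(z)^{-1}, drift omitted."""
    G = pullback_metric(decoder, z)
    L = torch.linalg.cholesky(torch.linalg.inv(G))       # sample ~ N(0, G^{-1})
    return z + dt ** 0.5 * L @ torch.randn_like(z)
```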

Weighted Random Search for CNN Hyperparameter Optimization

Title Weighted Random Search for CNN Hyperparameter Optimization
Authors Razvan Andonie, Adrian-Catalin Florea
Abstract Nearly all model algorithms used in machine learning use two different sets of parameters: the training parameters and the meta-parameters (hyperparameters). While the training parameters are learned during the training phase, the values of the hyperparameters have to be specified before learning starts. For a given dataset, we would like to find the optimal combination of hyperparameter values in a reasonable amount of time. This is a challenging task because of its computational complexity. In previous work [11], we introduced the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic. In the current paper, we compare the WRS method with several state-of-the-art hyperparameter optimization methods on Convolutional Neural Network (CNN) hyperparameter optimization. The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values. According to our experiments, the WRS algorithm outperforms the other methods.
Tasks Hyperparameter Optimization
Published 2020-03-30
URL https://arxiv.org/abs/2003.13300v1
PDF https://arxiv.org/pdf/2003.13300v1.pdf
PWC https://paperswithcode.com/paper/weighted-random-search-for-cnn-hyperparameter
Repo
Framework
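
A minimal reading of WRS: for each hyperparameter independently, either keep the best value found so far or resample it. The sketch below fixes the keep-probability, whereas the actual method adapts per-dimension probabilities from observed importance; take it as a caricature of the idea, not the published algorithm.

```python
import random

def weighted_random_search(objective, space, n_trials, p_keep=0.5):
    """space: dict name -> list of candidate values; maximizes objective."""
    best_cfg = {k: random.choice(v) for k, v in space.items()}
    best_val = objective(best_cfg)
    for _ in range(n_trials - 1):
        # per dimension: greedily keep the incumbent value or resample it
        cfg = {k: (best_cfg[k] if random.random() < p_keep else random.choice(v))
               for k, v in space.items()}
        val = objective(cfg)
        if val > best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val
```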

Optimization of genomic classifiers for clinical deployment: evaluation of Bayesian optimization for identification of predictive models of acute infection and in-hospital mortality

Title Optimization of genomic classifiers for clinical deployment: evaluation of Bayesian optimization for identification of predictive models of acute infection and in-hospital mortality
Authors Michael B. Mayhew, Elizabeth Tran, Kirindi Choi, Uros Midic, Roland Luethy, Nandita Damaraju, Ljubomir Buturovic
Abstract Acute infection, if not rapidly and accurately detected, can lead to sepsis, organ failure and even death. Currently, detection of acute infection as well as assessment of a patient’s severity of illness are based on imperfect (and often superficial) measures of patient physiology. Characterization of a patient’s immune response by quantifying expression levels of key genes from blood represents a potentially more timely and precise means of accomplishing both tasks. Machine learning methods provide a platform for development of deployment-ready classification models robust to the smaller, more heterogeneous datasets typical of healthcare. Identification of promising classifiers is dependent, in part, on hyperparameter optimization (HO), for which a number of approaches including grid search, random sampling and Bayesian optimization have been shown to be effective. In this analysis, we compare HO approaches for the development of diagnostic classifiers of acute infection and in-hospital mortality from gene expression of 29 diagnostic markers. Our comprehensive analysis of a multi-study patient cohort evaluates HO for three different classifier types and over a range of different optimization settings. Consistent with previous research, we find that Bayesian optimization is more efficient than grid search or random sampling-based methods, identifying promising classifiers with fewer evaluated hyperparameter configurations. However, we also find evidence of a lack of correspondence between internal and external validation performance of selected classifiers that complicates model selection for deployment as well as stymies development of clear-cut, practical guidelines for HO application in healthcare. We highlight the need for additional considerations about patient heterogeneity, dataset partitioning and optimization setup when applying HO methods in the healthcare context.
Tasks Hyperparameter Optimization, Model Selection
Published 2020-03-27
URL https://arxiv.org/abs/2003.12310v1
PDF https://arxiv.org/pdf/2003.12310v1.pdf
PWC https://paperswithcode.com/paper/optimization-of-genomic-classifiers-for
Repo
Framework
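
The three HO strategies compared in the paper have near drop-in implementations. The sketch below assumes scikit-optimize is installed for BayesSearchCV, and uses a logistic-regression stand-in rather than the paper's classifier types or 29-gene feature set.

```python
from scipy.stats import loguniform
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from skopt import BayesSearchCV   # scikit-optimize, assumed installed

est = LogisticRegression(max_iter=1000)
grid  = GridSearchCV(est, {'C': [0.01, 0.1, 1, 10, 100]}, cv=5)
rand  = RandomizedSearchCV(est, {'C': loguniform(1e-2, 1e2)}, n_iter=25, cv=5)
bayes = BayesSearchCV(est, {'C': (1e-2, 1e2, 'log-uniform')}, n_iter=25, cv=5)

# All three expose the same interface, so the comparison is symmetric:
#   searcher.fit(X_train, y_train); searcher.best_params_, searcher.best_score_
# The paper's caveat applies: internal CV scores here may not track
# external-validation performance on heterogeneous patient cohorts.
```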

Random Features Strengthen Graph Neural Networks

Title Random Features Strengthen Graph Neural Networks
Authors Ryoma Sato, Makoto Yamada, Hisashi Kashima
Abstract Graph neural networks (GNNs) are powerful machine learning models for various graph learning tasks. Recently, the limitations of the expressive power of various GNN models have been revealed. For example, GNNs cannot distinguish some non-isomorphic graphs and they cannot learn efficient graph algorithms, and several GNN models have been proposed to overcome these limitations. In this paper, we demonstrate that GNNs become powerful just by adding a random feature to each node. We prove that the random features enable GNNs to learn almost optimal polynomial-time approximation algorithms for the minimum dominating set problem and maximum matching problem in terms of the approximation ratio. The main advantage of our method is that it can be combined with off-the-shelf GNN models with slight modifications. Through experiments, we show that the addition of random features enables GNNs to solve various problems that normal GNNs, including GCNs and GINs, cannot solve.
Tasks
Published 2020-02-08
URL https://arxiv.org/abs/2002.03155v2
PDF https://arxiv.org/pdf/2002.03155v2.pdf
PWC https://paperswithcode.com/paper/random-features-strengthen-graph-neural
Repo
Framework
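
The proposed modification is deliberately tiny, which a sketch makes clear: append one random column to the node-feature matrix before running any off-the-shelf GNN. The uniform distribution below is an assumption for illustration; the paper analyzes specific feature distributions.

```python
import torch

def add_random_feature(x):
    """x: (num_nodes, num_features) -> same matrix with one random column."""
    r = torch.rand(x.shape[0], 1)        # fresh random scalar per node
    return torch.cat([x, r], dim=1)

# usage with any GNN taking (node features, adjacency):
#   out = gnn(add_random_feature(x), adj)
```

The random column breaks the symmetry between otherwise indistinguishable nodes, which is what lets message passing separate graphs that plain GCNs and GINs cannot.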

Rethinking the Hyperparameters for Fine-tuning

Title Rethinking the Hyperparameters for Fine-tuning
Authors Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
Abstract Fine-tuning from pre-trained ImageNet models has become the de-facto standard for various computer vision tasks. Current practices for fine-tuning typically involve selecting an ad-hoc choice of hyperparameters and keeping them fixed to values normally used for training from scratch. This paper re-examines several common practices of setting hyperparameters for fine-tuning. Our findings are based on extensive empirical evaluation for fine-tuning on various transfer learning benchmarks. (1) While prior works have thoroughly investigated learning rate and batch size, momentum for fine-tuning is a relatively unexplored parameter. We find that the value of momentum also affects fine-tuning performance and connect it with previous theoretical findings. (2) Optimal hyperparameters for fine-tuning, in particular, the effective learning rate, are not only dataset dependent but also sensitive to the similarity between the source domain and target domain. This is in contrast to hyperparameters for training from scratch. (3) Reference-based regularization that keeps models close to the initial model does not necessarily apply for “dissimilar” datasets. Our findings challenge common practices of fine-tuning and encourage deep learning practitioners to rethink the hyperparameters for fine-tuning.
Tasks Transfer Learning
Published 2020-02-19
URL https://arxiv.org/abs/2002.11770v1
PDF https://arxiv.org/pdf/2002.11770v1.pdf
PWC https://paperswithcode.com/paper/rethinking-the-hyperparameters-for-fine-1
Repo
Framework
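
A concrete version of the momentum question: the quantity that tends to matter is the effective learning rate lr / (1 - momentum), so different (lr, momentum) pairs can be matched when sweeping. The sketch below sets up such a fine-tuning sweep with a torchvision ResNet-18; it mirrors the practice under study, not the authors' exact experimental code.

```python
import torch
import torchvision

def make_finetune_optimizer(num_classes=10, lr=0.01, momentum=0.9):
    """Pre-trained backbone, fresh head, SGD with the given (lr, momentum)."""
    model = torchvision.models.resnet18(weights='IMAGENET1K_V1')
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    return model, opt

# Matched effective learning rates, different momentum values:
# (lr=0.01, m=0.9) and (lr=0.05, m=0.5) both give lr / (1 - m) = 0.1,
# which lets a sweep separate the effect of momentum from that of step size.
```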

Implicit differentiation of Lasso-type models for hyperparameter optimization

Title Implicit differentiation of Lasso-type models for hyperparameter optimization
Authors Quentin Bertrand, Quentin Klopfenstein, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon
Abstract Setting regularization parameters for Lasso-type estimators is notoriously difficult, though crucial in practice. The most popular hyperparameter optimization approach is grid-search using held-out validation data. Grid-search, however, requires choosing a predefined grid for each parameter, which scales exponentially with the number of parameters. Another approach is to cast hyperparameter optimization as a bi-level optimization problem that can be solved by gradient descent. The key challenge for these methods is the estimation of the gradient with respect to the hyperparameters. Computing this gradient via forward or backward automatic differentiation is possible yet usually suffers from high memory consumption. Alternatively, implicit differentiation typically involves solving a linear system, which can be prohibitive and numerically unstable in high dimensions. In addition, implicit differentiation usually assumes smooth loss functions, which is not the case for Lasso-type problems. This work introduces an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems. Our approach scales to high-dimensional data by leveraging the sparsity of the solutions. Experiments demonstrate that the proposed method outperforms a large number of standard methods to optimize the error on held-out data, or the Stein Unbiased Risk Estimator (SURE).
Tasks Hyperparameter Optimization
Published 2020-02-20
URL https://arxiv.org/abs/2002.08943v1
PDF https://arxiv.org/pdf/2002.08943v1.pdf
PWC https://paperswithcode.com/paper/implicit-differentiation-of-lasso-type-models
Repo
Framework
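
The sparsity leverage is easy to see in the plain Lasso case: at the solution, the stationarity conditions on the active set yield the hypergradient from a linear system whose size is only the number of nonzero coefficients. A minimal numpy/scikit-learn sketch (validation loss taken as squared error; sklearn's 1/(2n) objective scaling assumed; the paper's actual algorithm additionally avoids the explicit solve):

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_hypergradient(X, y, X_val, y_val, alpha):
    """d(val loss)/d(alpha) via implicit differentiation on the support."""
    n = X.shape[0]
    beta = Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_
    S = np.flatnonzero(beta)                    # active set
    if S.size == 0:
        return 0.0
    Xs = X[:, S]
    # sklearn objective: (1/2n)||y - Xb||^2 + alpha*||b||_1, so on the support
    # (1/n) Xs^T (Xs b - y) + alpha*sign(b) = 0  =>  db/dalpha as below
    dbeta = -n * np.linalg.solve(Xs.T @ Xs, np.sign(beta[S]))
    grad_val = X_val[:, S].T @ (X_val @ beta - y_val)   # d(val loss)/d(beta_S)
    return float(grad_val @ dbeta)
```

The linear system is |S| x |S| rather than d x d, which is the high-dimensional scaling argument in the abstract.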

Global Convergence and Geometric Characterization of Slow to Fast Weight Evolution in Neural Network Training for Classifying Linearly Non-Separable Data

Title Global Convergence and Geometric Characterization of Slow to Fast Weight Evolution in Neural Network Training for Classifying Linearly Non-Separable Data
Authors Ziang Long, Penghang Yin, Jack Xin
Abstract In this paper, we study the dynamics of gradient descent in learning neural networks for classification problems. Unlike in existing works, we consider the linearly non-separable case where the training data of different classes lie in orthogonal subspaces. We show that when the network has a sufficient (but not exceedingly large) number of neurons, (1) the corresponding minimization problem has a desirable landscape where all critical points are global minima with perfect classification; (2) gradient descent is guaranteed to converge to the global minima in this case. Moreover, we discover a geometric condition on the network weights such that, when it is satisfied, the weight evolution transitions from a slow phase of weight-direction spreading to a fast phase of weight convergence. The geometric condition is that the convex hull of the weights, projected onto the unit sphere, contains the origin.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12563v2
PDF https://arxiv.org/pdf/2002.12563v2.pdf
PWC https://paperswithcode.com/paper/global-convergence-and-geometric
Repo
Framework
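
The geometric condition is directly checkable: test whether the origin lies in the convex hull of the unit-normalized weight vectors, which is a small feasibility LP. A sketch with scipy (the function name and tolerance handling are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def hull_contains_origin(W):
    """W: (num_neurons, dim) weight matrix, one neuron weight vector per row."""
    U = W / np.linalg.norm(W, axis=1, keepdims=True)  # project rows to unit sphere
    n = U.shape[0]
    # feasibility LP: find lambda >= 0 with sum(lambda) = 1 and U^T lambda = 0
    A_eq = np.vstack([U.T, np.ones((1, n))])
    b_eq = np.append(np.zeros(U.shape[1]), 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return bool(res.success)
```

During training one could log this flag per epoch to observe the slow-to-fast transition the paper characterizes.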

AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning

Title AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning
Authors Sanchita Ghose, John J. Prevost
Abstract In movie productions, the Foley Artist is responsible for creating an overlay soundtrack that helps the movie come alive for the audience. This requires the artist to first identify the sounds that will enhance the experience for the listener, thereby reinforcing the Director’s intention for a given scene. In this paper, we present AutoFoley, a fully automated deep learning tool that can be used to synthesize a representative audio track for videos. AutoFoley can be used in applications where there is either no corresponding audio file associated with the video or where there is a need to identify critical scenarios and provide a synthesized, reinforced soundtrack. An important performance criterion is that the synthesized soundtrack be time-synchronized with the input video, which provides for a realistic and believable portrayal of the synthesized sound. Unlike existing sound prediction and generation architectures, our algorithm is capable of precise recognition of actions as well as inter-frame relations in fast-moving video clips by incorporating an interpolation technique and Temporal Relationship Networks (TRN). We employ a robust multi-scale Recurrent Neural Network (RNN) associated with a Convolutional Neural Network (CNN) for a better understanding of the intricate input-to-output associations over time. To evaluate AutoFoley, we create and introduce a large-scale audio-video dataset containing a variety of sounds frequently used as Foley effects in movies. Our experiments show that the synthesized sounds are realistically portrayed with accurate temporal synchronization of the associated visual inputs. In human qualitative testing of AutoFoley, over 73% of the test subjects considered the generated soundtrack original, a noteworthy improvement in cross-modal sound-synthesis research.
Tasks
Published 2020-02-21
URL https://arxiv.org/abs/2002.10981v1
PDF https://arxiv.org/pdf/2002.10981v1.pdf
PWC https://paperswithcode.com/paper/autofoley-artificial-synthesis-of
Repo
Framework
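
At a very high level, the pipeline maps per-frame visual features to spectrogram frames. The sketch below shows only that skeleton (CNN features in, mel frames out); the interpolation step, the Temporal Relationship Network, and the waveform-synthesis stage from the paper are all omitted, and the dimensions are assumptions.

```python
import torch
import torch.nn as nn

class VideoToSound(nn.Module):
    """Skeleton of a video-to-spectrogram predictor: RNN over CNN frame features."""
    def __init__(self, feat_dim=512, hidden=256, n_mels=80):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_mels)

    def forward(self, frame_feats):      # (batch, frames, feat_dim) from a CNN
        h, _ = self.rnn(frame_feats)
        return self.head(h)              # per-frame mel-spectrogram prediction
```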

Multi-Task Multicriteria Hyperparameter Optimization

Title Multi-Task Multicriteria Hyperparameter Optimization
Authors Kirill Akhmetzyanov, Alexander Yuzhakov
Abstract We present a new method for searching optimal hyperparameters among several tasks and several criteria. Multi-Task Multi Criteria method (MTMC) provides several Pareto-optimal solutions, among which one solution is selected with given criteria significance coefficients. The article begins with a mathematical formulation of the problem of choosing optimal hyperparameters. Then, the steps of the MTMC method that solves this problem are described. The proposed method is evaluated on the image classification problem using a convolutional neural network. The article presents optimal hyperparameters for various criteria significance coefficients.
Tasks Hyperparameter Optimization, Image Classification
Published 2020-02-15
URL https://arxiv.org/abs/2002.06372v1
PDF https://arxiv.org/pdf/2002.06372v1.pdf
PWC https://paperswithcode.com/paper/multi-task-multicriteria-hyperparameter
Repo
Framework
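
The MTMC selection step decomposes into two parts that are easy to sketch: filter hyperparameter configurations to the Pareto front across the criteria, then rank the survivors by the given significance coefficients. The weighted-dot-product scalarization below is an assumption about the exact selection rule.

```python
import numpy as np

def pareto_front(scores):
    """scores: (n_configs, n_criteria) array, higher is better on each criterion."""
    front = []
    for i, s in enumerate(scores):
        others = np.delete(scores, i, axis=0)
        dominated = any((t >= s).all() and (t > s).any() for t in others)
        if not dominated:
            front.append(i)
    return front

def select(scores, significance):
    """Pick one Pareto-optimal configuration by significance coefficients."""
    scores = np.asarray(scores, dtype=float)
    front = pareto_front(scores)
    return max(front, key=lambda i: scores[i] @ np.asarray(significance))
```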