January 27, 2020

3028 words 15 mins read

Paper Group ANR 1155

Motion Capture from Pan-Tilt Cameras with Unknown Orientation. Diversity-Promoting Deep Reinforcement Learning for Interactive Recommendation. Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model. DysLexML: Screening Tool for Dyslexia Using Machine Learning. Acoustically Grounded Word Embeddings for Improved Acoustics-t …

Motion Capture from Pan-Tilt Cameras with Unknown Orientation

Title Motion Capture from Pan-Tilt Cameras with Unknown Orientation
Authors Roman Bachmann, Jörg Spörri, Pascal Fua, Helge Rhodin
Abstract In sports, such as alpine skiing, coaches would like to know the speed and various biomechanical variables of their athletes and competitors. Existing methods use either body-worn sensors, which are cumbersome to set up, or manual image annotation, which is time consuming. We propose a method for estimating an athlete’s global 3D position and articulated pose using multiple cameras. In contrast to classical markerless motion capture solutions, we allow cameras to rotate freely so that large capture volumes can be covered. In a first step, tight crops around the skier are predicted and fed to a 2D pose estimator network. The 3D pose is then reconstructed using a bundle adjustment method. Key to our solution is the rotation estimation of Pan-Tilt cameras in a joint optimization with the athlete pose and conditioning on relative background motion computed with feature tracking. Furthermore, we created a new alpine skiing dataset and annotated it with 2D pose labels, to overcome shortcomings of existing ones. Our method estimates accurate global 3D poses from images only and provides coaches with an automatic and fast tool for measuring and improving an athlete’s performance.
Tasks Markerless Motion Capture, Motion Capture
Published 2019-08-30
URL https://arxiv.org/abs/1908.11676v1
PDF https://arxiv.org/pdf/1908.11676v1.pdf
PWC https://paperswithcode.com/paper/motion-capture-from-pan-tilt-cameras-with
Repo
Framework
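
To make the joint optimization concrete, here is a minimal, hypothetical sketch of the core idea: the pan/tilt angles of each camera and the athlete's 3D joint positions are refined together by minimizing 2D reprojection error with a small bundle-adjustment-style solver. Every name, camera parameter, and data shape below is an illustrative assumption, not the authors' implementation (which additionally conditions the rotation estimate on background feature tracks).

```python
# Toy joint refinement of pan/tilt angles and 3D joints from 2D detections.
import numpy as np
from scipy.optimize import least_squares

def rot_pan_tilt(pan, tilt):
    """Rotation from pan (yaw) and tilt (pitch) angles, in radians."""
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    R_pan = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    R_tilt = np.array([[1, 0, 0], [0, ct, -st], [0, st, ct]])
    return R_tilt @ R_pan

def project(X, pan, tilt, center, K):
    """Project 3D points X (J,3) into a pan-tilt camera with centre `center`."""
    Xc = (rot_pan_tilt(pan, tilt) @ (X - center).T).T   # camera coordinates
    uv = (K @ Xc.T).T
    return uv[:, :2] / uv[:, 2:3]                       # perspective division

def residuals(params, detections, centers, K, n_cams, n_joints):
    """Stack reprojection errors; params = [pan/tilt per camera, 3D joints]."""
    angles = params[:2 * n_cams].reshape(n_cams, 2)
    X = params[2 * n_cams:].reshape(n_joints, 3)
    errs = [project(X, a[0], a[1], centers[c], K) - detections[c]
            for c, a in enumerate(angles)]
    return np.concatenate(errs).ravel()

# Synthetic usage: 2 rotating cameras observing 17 joints.
rng = np.random.default_rng(0)
n_cams, n_joints = 2, 17
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])
centers = np.array([[0.0, 1.5, -10.0], [8.0, 1.5, -8.0]])
X_true = rng.normal([0, 1, 0], 0.5, size=(n_joints, 3))
true_angles = np.array([[0.05, 0.02], [-0.4, 0.03]])
detections = np.stack([project(X_true, a[0], a[1], centers[c], K)
                       for c, a in enumerate(true_angles)])
x0 = np.concatenate([np.zeros(2 * n_cams),
                     X_true.ravel() + rng.normal(0, 0.3, n_joints * 3)])
fit = least_squares(residuals, x0, args=(detections, centers, K, n_cams, n_joints))
```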

Diversity-Promoting Deep Reinforcement Learning for Interactive Recommendation

Title Diversity-Promoting Deep Reinforcement Learning for Interactive Recommendation
Authors Yong Liu, Yinan Zhang, Qiong Wu, Chunyan Miao, Lizhen Cui, Binqiang Zhao, Yin Zhao, Lu Guan
Abstract Interactive recommendation, which models the explicit interactions between users and the recommender system, has attracted a lot of research attention in recent years. Most previous interactive recommendation systems only focus on optimizing recommendation accuracy while overlooking other important aspects of recommendation quality, such as the diversity of recommendation results. In this paper, we propose a novel recommendation model, named \underline{D}iversity-promoting \underline{D}eep \underline{R}einforcement \underline{L}earning (D$^2$RL), which encourages diversity in the results of interactive recommendation. More specifically, we adopt a Determinantal Point Process (DPP) model to generate diverse yet relevant item recommendations. A personalized DPP kernel matrix is maintained for each user, which is constructed from two parts: a fixed similarity matrix capturing item-item similarity, and the relevance of items dynamically learnt through an actor-critic reinforcement learning framework. We performed extensive offline experiments as well as simulated online experiments with real-world datasets to demonstrate the effectiveness of the proposed model.
Tasks Recommendation Systems
Published 2019-03-19
URL http://arxiv.org/abs/1903.07826v1
PDF http://arxiv.org/pdf/1903.07826v1.pdf
PWC https://paperswithcode.com/paper/diversity-promoting-deep-reinforcement
Repo
Framework
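
A hedged sketch of the DPP re-ranking component described above: a personalized kernel L = Diag(q) S Diag(q) combines item relevance q (learned by the actor-critic in the paper; here simply given) with a fixed item-item similarity matrix S, and a greedy MAP step picks a diverse-yet-relevant slate. This is a generic illustration, not the authors' code; all variable names and the toy data are assumptions.

```python
import numpy as np

def dpp_greedy_map(S, q, k):
    """Greedily select k items maximizing the DPP log-determinant objective."""
    L = np.diag(q) @ S @ np.diag(q)          # personalized DPP kernel
    selected, candidates = [], list(range(len(q)))
    for _ in range(k):
        best, best_val = None, -np.inf
        for j in candidates:
            idx = selected + [j]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            val = logdet if sign > 0 else -np.inf
            if val > best_val:
                best, best_val = j, val
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy usage: 6 items, item 1 is a near-duplicate of item 0, relevance favours both.
rng = np.random.default_rng(1)
emb = rng.normal(size=(6, 4))
emb[1] = emb[0] + 0.01 * rng.normal(size=4)
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
S = emb @ emb.T                               # cosine similarity kernel
q = np.array([0.9, 0.85, 0.8, 0.5, 0.4, 0.3]) # relevance scores
print(dpp_greedy_map(S, q, k=3))              # diverse slate, not just top-3 by q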

Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model

Title Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model
Authors Yasaman Mahdaviyeh, Zacharie Naulet
Abstract We study the risk of the minimum norm linear least squares estimator when the number of parameters $d$ depends on the sample size $n$ and $\frac{d}{n} \rightarrow \infty$. We assume that the data has an underlying low rank structure by restricting ourselves to spike covariance matrices, where a fixed finite number of eigenvalues grow with $n$ and are much larger than the remaining eigenvalues, which are (asymptotically) of the same order. We show that in this setting the risk of the minimum norm least squares estimator vanishes compared to the risk of the null estimator. We give asymptotic and non-asymptotic upper bounds for this risk, and also leverage the spike model assumption to give an analysis of the bias that leads to tighter bounds compared to previous works.
Tasks
Published 2019-12-31
URL https://arxiv.org/abs/1912.13421v2
PDF https://arxiv.org/pdf/1912.13421v2.pdf
PWC https://paperswithcode.com/paper/asymptotic-risk-of-least-squares-minimum-norm
Repo
Framework
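
A small simulation sketch of the setting studied above (not the paper's code): data drawn from a spike covariance model with d much larger than n, the minimum norm least squares solution computed with the pseudoinverse, and its prediction risk compared to the null (all-zeros) estimator. Dimensions, signal strengths, and the risk proxy are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5000
# Covariance: one spiked eigenvalue growing with the dimension, isotropic bulk.
sigma_sqrt_diag = np.ones(d)
sigma_sqrt_diag[0] = np.sqrt(d)
beta = np.zeros(d)
beta[0] = 1.0                           # signal aligned with the spike direction

X = rng.normal(size=(n, d)) * sigma_sqrt_diag
y = X @ beta + 0.1 * rng.normal(size=n)

beta_hat = np.linalg.pinv(X) @ y        # minimum norm least squares solution

# Prediction risk E[(x^T beta_hat - x^T beta)^2] estimated on fresh samples.
X_test = rng.normal(size=(2000, d)) * sigma_sqrt_diag
risk_minnorm = np.mean((X_test @ (beta_hat - beta)) ** 2)
risk_null = np.mean((X_test @ beta) ** 2)        # risk of the all-zeros estimator
print(risk_minnorm / risk_null)                  # small ratio: min-norm LS beats null
```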

DysLexML: Screening Tool for Dyslexia Using Machine Learning

Title DysLexML: Screening Tool for Dyslexia Using Machine Learning
Authors Thomais Asvestopoulou, Victoria Manousaki, Antonis Psistakis, Ioannis Smyrnakis, Vassilios Andreadakis, Ioannis M. Aslanides, Maria Papadopouli
Abstract Eye movements during text reading can provide insights about reading disorders. Via eye-trackers, we can measure when, where and how eyes move in relation to the words they read. Machine Learning (ML) algorithms can decode this information and provide differential analysis. This work developed DysLexML, a screening tool for developmental dyslexia that applies various ML algorithms to analyze fixation points recorded via eye-tracking during silent reading by children. It comparatively evaluated its performance using measurements collected in a systematic field study with 69 native Greek speakers, children, 32 of whom were diagnosed as dyslexic by the official governmental agency for diagnosing learning and reading difficulties in Greece. We examined a large set of features based on statistical properties of fixations and saccadic movements and identified the ones with prominent predictive power, performing dimensionality reduction. Specifically, DysLexML achieves its best performance using a linear SVM, with an accuracy of 97%, with a small feature set, namely saccade length, number of short forward movements, and number of multiply fixated words. Furthermore, we analyzed the impact of noise on the fixation positions and showed that DysLexML is accurate and robust in the presence of noise. These encouraging results set the basis for developing screening tools in less controlled, larger-scale environments, with inexpensive eye-trackers, potentially reaching a larger population for early intervention.
Tasks Dimensionality Reduction, Eye Tracking
Published 2019-03-14
URL http://arxiv.org/abs/1903.06274v1
PDF http://arxiv.org/pdf/1903.06274v1.pdf
PWC https://paperswithcode.com/paper/dyslexml-screening-tool-for-dyslexia-using
Repo
Framework
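
An illustrative sketch (not the authors' pipeline) of the final classifier described above: a linear SVM over a small set of fixation/saccade features such as saccade length, number of short forward movements, and number of multiply fixated words. The synthetic feature table below is a placeholder assumption standing in for the real eye-tracking measurements.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_typical, n_dyslexic = 37, 32
# Columns: mean saccade length, short forward movements, multiply fixated words.
X_typical = rng.normal([8.0, 40.0, 5.0], [1.5, 8.0, 2.0], size=(n_typical, 3))
X_dyslexic = rng.normal([5.5, 60.0, 12.0], [1.5, 8.0, 2.0], size=(n_dyslexic, 3))
X = np.vstack([X_typical, X_dyslexic])
y = np.array([0] * n_typical + [1] * n_dyslexic)

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, cv=5)   # cross-validated screening accuracy
print(scores.mean())
```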

Acoustically Grounded Word Embeddings for Improved Acoustics-to-Word Speech Recognition

Title Acoustically Grounded Word Embeddings for Improved Acoustics-to-Word Speech Recognition
Authors Shane Settle, Kartik Audhkhasi, Karen Livescu, Michael Picheny
Abstract Direct acoustics-to-word (A2W) systems for end-to-end automatic speech recognition are simpler to train, and more efficient to decode with, than sub-word systems. However, A2W systems can have difficulties at training time when data is limited, and at decoding time when recognizing words outside the training vocabulary. To address these shortcomings, we investigate the use of recently proposed acoustic and acoustically grounded word embedding techniques in A2W systems. The idea is based on treating the final pre-softmax weight matrix of an A2W recognizer as a matrix of word embedding vectors, and using an externally trained set of word embeddings to improve the quality of this matrix. In particular, we introduce two ideas: (1) enforcing similarity at training time between the external embeddings and the recognizer weights, and (2) using the word embeddings at test time for predicting out-of-vocabulary words. Our word embedding model is acoustically grounded, that is, it is learned jointly with acoustic embeddings so as to encode the words’ acoustic-phonetic content; and it is parametric, so that it can embed any arbitrary (potentially out-of-vocabulary) sequence of characters. We find that both techniques improve the performance of an A2W recognizer on conversational telephone speech.
Tasks Speech Recognition, Word Embeddings
Published 2019-03-29
URL http://arxiv.org/abs/1903.12306v1
PDF http://arxiv.org/pdf/1903.12306v1.pdf
PWC https://paperswithcode.com/paper/acoustically-grounded-word-embeddings-for
Repo
Framework
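
A hedged sketch of the two ideas summarized above, in generic PyTorch (this is not the authors' implementation): (1) an auxiliary loss tying the pre-softmax weight rows of a word-level recognizer to externally trained, acoustically grounded word embeddings, and (2) scoring an out-of-vocabulary word at test time by appending its external embedding as an extra output row. Sizes, names, and the toy data are assumptions.

```python
import torch
import torch.nn.functional as F

hidden_dim, vocab_size, emb_dim = 256, 1000, 256
encoder_out = torch.randn(8, hidden_dim)          # stand-in acoustic encoder states
targets = torch.randint(0, vocab_size, (8,))      # word targets for these states
ext_emb = F.normalize(torch.randn(vocab_size, emb_dim), dim=1)  # external AGWEs

W = torch.nn.Parameter(torch.randn(vocab_size, hidden_dim) * 0.01)  # output layer

def loss_fn(W, h, y, ext_emb, lam=1.0):
    ce = F.cross_entropy(h @ W.t(), y)             # word-level classification loss
    tie = ((W - ext_emb) ** 2).sum(dim=1).mean()   # keep output rows near the AGWEs
    return ce + lam * tie

opt = torch.optim.Adam([W], lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss_fn(W, encoder_out, targets, ext_emb).backward()
    opt.step()

# Test time: score a previously unseen word by adding its external embedding
# as one more row of the output matrix.
oov_embedding = F.normalize(torch.randn(1, emb_dim), dim=1)
W_extended = torch.cat([W.detach(), oov_embedding], dim=0)
scores = encoder_out @ W_extended.t()              # logits including the OOV word
```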

Adaptive Anomaly Detection in Chaotic Time Series with a Spatially Aware Echo State Network

Title Adaptive Anomaly Detection in Chaotic Time Series with a Spatially Aware Echo State Network
Authors Niklas Heim, James E. Avery
Abstract This work builds an automated anomaly detection method for chaotic time series, and more concretely for turbulent, high-dimensional, ocean simulations. We solve this task by extending the Echo State Network by spatially aware input maps, such as convolutions, gradients, cosine transforms, et cetera, as well as a spatially aware loss function. The spatial ESN is used to create predictions which reduce the detection problem to thresholding of the prediction error. We benchmark our detection framework on different tasks of increasing difficulty to show the generality of the framework before applying it to raw climate model output in the region of the Japanese ocean current Kuroshio, which exhibits a bimodality that is not easily detected by the naked eye. The code is available as an open source Python package, Torsk, available at https://github.com/nmheim/torsk, where we also provide supplementary material and programs that reproduce the results shown in this paper.
Tasks Anomaly Detection, Time Series
Published 2019-09-02
URL https://arxiv.org/abs/1909.01709v1
PDF https://arxiv.org/pdf/1909.01709v1.pdf
PWC https://paperswithcode.com/paper/adaptive-anomaly-detection-in-chaotic-time
Repo
Framework
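
A compact, generic echo state network sketch (not the Torsk package linked above) showing the detection recipe described in the abstract: train an ESN to predict the next step of a chaotic series, then flag time steps whose prediction error exceeds a threshold. The series, reservoir sizes, and threshold rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Chaotic logistic-map series with an anomaly injected after generation.
x = np.empty(2000); x[0] = 0.4
for t in range(1999):
    x[t + 1] = 3.9 * x[t] * (1 - x[t])
x[1500:1510] += 0.4                                  # anomalous segment

n_res = 200
W_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # spectral radius below 1

def run_reservoir(series):
    h, states = np.zeros(n_res), []
    for u in series:
        h = np.tanh(W @ h + W_in[:, 0] * u)
        states.append(h.copy())
    return np.array(states)

train, test = x[:1000], x[1000:]
S = run_reservoir(train[:-1])
# Ridge-regression readout predicting the next value from the reservoir state.
ridge = 1e-6
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ train[1:])

S_test = run_reservoir(test[:-1])
err = np.abs(S_test @ W_out - test[1:])
threshold = err[:400].mean() + 5 * err[:400].std()   # calibrate on pre-anomaly errors
print(np.nonzero(err > threshold)[0] + 1001)         # indices near the injected anomaly
```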

Ordinal Bayesian Optimisation

Title Ordinal Bayesian Optimisation
Authors Victor Picheny, Sattar Vakili, Artem Artemev
Abstract Bayesian optimisation is a powerful tool to solve expensive black-box problems, but fails when the stationary assumption made on the objective function is strongly violated, which is the case in particular for ill-conditioned or discontinuous objectives. We tackle this problem by proposing a new Bayesian optimisation framework that only considers the ordering of variables, both in the input and output spaces, to fit a Gaussian process in a latent space. By doing so, our approach is agnostic to the original metrics on the original spaces. We propose two algorithms, respectively based on an optimistic strategy and on Thompson sampling. For the optimistic strategy we prove an optimal performance under the measure of regret in the latent space. We illustrate the capability of our framework on several challenging toy problems.
Tasks Bayesian Optimisation
Published 2019-12-05
URL https://arxiv.org/abs/1912.02493v1
PDF https://arxiv.org/pdf/1912.02493v1.pdf
PWC https://paperswithcode.com/paper/ordinal-bayesian-optimisation
Repo
Framework
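
A rough illustration of the ordinal idea described above (a simplification, not the paper's algorithms): the Gaussian process is fitted to rank-transformed observations, so the surrogate depends only on the ordering of outputs, and the next point is picked with an optimistic (lower-confidence-bound) rule. The kernel, grid, and ill-conditioned test objective are assumptions.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                      # ill-conditioned test objective, minimum at x = 0.3
    return np.exp(12 * np.abs(x - 0.3)) - 1.0

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))     # initial design
grid = np.linspace(0, 1, 201).reshape(-1, 1)

for _ in range(15):
    y = objective(X[:, 0])
    y_rank = rankdata(y)               # keep only the ordering of observed outputs
    y_rank = (y_rank - y_rank.mean()) / y_rank.std()
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6)
    gp.fit(X, y_rank)
    mu, sd = gp.predict(grid, return_std=True)
    x_next = grid[np.argmin(mu - 2.0 * sd)]    # optimistic rule for minimization
    X = np.vstack([X, x_next])

print(X[np.argmin(objective(X[:, 0]))])        # best point found (true minimum near 0.3)
```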

A Primer on PAC-Bayesian Learning

Title A Primer on PAC-Bayesian Learning
Authors Benjamin Guedj
Abstract Generalised Bayesian learning algorithms are increasingly popular in machine learning, due to their PAC generalisation properties and flexibility. The present paper aims at providing a self-contained survey on the resulting PAC-Bayes framework and some of its main theoretical and algorithmic developments.
Tasks
Published 2019-01-16
URL https://arxiv.org/abs/1901.05353v3
PDF https://arxiv.org/pdf/1901.05353v3.pdf
PWC https://paperswithcode.com/paper/a-primer-on-pac-bayesian-learning
Repo
Framework
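
For readers who want the flavour of the framework surveyed here, a standard McAllester/Maurer-style PAC-Bayes bound reads roughly as follows; the notation and constants follow common presentations and may differ slightly from the forms discussed in the paper.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n, simultaneously
% for all "posterior" distributions Q over hypotheses and any fixed prior P:
\mathbb{E}_{h \sim Q}\big[L(h)\big]
  \;\le\;
  \mathbb{E}_{h \sim Q}\big[\hat{L}_n(h)\big]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
% where L is the expected risk and \hat{L}_n the empirical risk on the sample.
```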

OperatorNet: Recovering 3D Shapes From Difference Operators

Title OperatorNet: Recovering 3D Shapes From Difference Operators
Authors Ruqi Huang, Marie-Julie Rakotosaona, Panos Achlioptas, Leonidas Guibas, Maks Ovsjanikov
Abstract This paper proposes a learning-based framework for reconstructing 3D shapes from functional operators, compactly encoded as small-sized matrices. To this end we introduce a novel neural architecture, called OperatorNet, which takes as input a set of linear operators representing a shape and produces its 3D embedding. We demonstrate that this approach significantly outperforms previous purely geometric methods for the same problem. Furthermore, we introduce a novel functional operator, which encodes the extrinsic or pose-dependent shape information, and thus complements purely intrinsic pose-oblivious operators, such as the classical Laplacian. Coupled with this novel operator, our reconstruction network achieves very high reconstruction accuracy, even in the presence of incomplete information about a shape, given a soft or functional map expressed in a reduced basis. Finally, we demonstrate that the multiplicative functional algebra enjoyed by these operators can be used to synthesize entirely new unseen shapes, in the context of shape interpolation and shape analogy applications.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10754v2
PDF https://arxiv.org/pdf/1904.10754v2.pdf
PWC https://paperswithcode.com/paper/operatornet-recovering-3d-shapes-from
Repo
Framework
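
A very small stand-in (not the OperatorNet architecture itself) showing the input/output contract described above: a network takes a set of k x k functional operator matrices expressed in a reduced basis and regresses the 3D vertex positions of the underlying shape. Sizes, layers, and the random placeholder training pairs are assumptions.

```python
import torch
import torch.nn as nn

k, n_ops, n_verts = 30, 3, 1000                    # basis size, #operators, #vertices

decoder = nn.Sequential(                           # flattened operators -> 3D embedding
    nn.Flatten(),
    nn.Linear(n_ops * k * k, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, n_verts * 3),
)

# Toy training loop with random placeholder pairs (operators, vertex coordinates);
# in the paper these would be shape-difference operators and registered meshes.
ops = torch.randn(16, n_ops, k, k)
verts = torch.randn(16, n_verts, 3)
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    pred = decoder(ops).view(-1, n_verts, 3)
    loss = ((pred - verts) ** 2).mean()            # vertex-wise reconstruction loss
    loss.backward()
    opt.step()
```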

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization

Title Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
Authors Tomaso Poggio, Andrzej Banburski, Qianli Liao
Abstract While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning, however, is beginning to emerge. It covers the following questions: 1) the representation power of deep networks; 2) optimization of the empirical risk; 3) generalization properties of gradient descent techniques: why does the expected error not suffer, despite the absence of explicit regularization, when the networks are overparametrized? In this review we discuss recent advances in these three areas. In approximation theory, both shallow and deep networks have been shown to approximate any continuous function on a bounded domain at the expense of an exponential number of parameters (exponential in the dimensionality of the function). However, for a subset of compositional functions, deep networks of the convolutional type can have a linear dependence on dimensionality, unlike shallow networks. In optimization we discuss the loss landscape for the exponential loss function and show that stochastic gradient descent will find the global minima with high probability. To address the question of generalization for classification tasks, we use classical uniform convergence results to justify minimizing a surrogate exponential-type loss function under a unit norm constraint on the weight matrix at each layer, since the interesting variables for classification are the weight directions rather than the weights. Our approach, which is supported by several independent new results, offers a solution to the puzzle about the generalization performance of deep overparametrized ReLU networks, uncovering the origin of the underlying hidden complexity control.
Tasks
Published 2019-08-25
URL https://arxiv.org/abs/1908.09375v1
PDF https://arxiv.org/pdf/1908.09375v1.pdf
PWC https://paperswithcode.com/paper/theoretical-issues-in-deep-networks
Repo
Framework
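
A tiny numerical check of one point made above: for a ReLU network, rescaling the weights of each layer by positive constants rescales the outputs but never changes which class has the largest score, so only the weight directions matter for classification. The two-layer network below is an illustrative toy, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(64, 10)), rng.normal(size=(3, 64))

def logits(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)   # ReLU network, positively homogeneous in W

x = rng.normal(size=10)
c1, c2 = 0.1, 7.3                          # arbitrary positive per-layer rescalings
same = np.argmax(logits(x, W1, W2)) == np.argmax(logits(x, c1 * W1, c2 * W2))
print(same)                                # True: the predicted class is scale-invariant
```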

UPI-Net: Semantic Contour Detection in Placental Ultrasound

Title UPI-Net: Semantic Contour Detection in Placental Ultrasound
Authors Huan Qi, Sally Collins, J. Alison Noble
Abstract Semantic contour detection is a challenging problem that is often met in medical imaging, of which placental image analysis is a particular example. In this paper, we investigate utero-placental interface (UPI) detection in 2D placental ultrasound images by formulating it as a semantic contour detection problem. As opposed to natural images, placental ultrasound images contain specific anatomical structures thus have unique geometry. We argue it would be beneficial for UPI detectors to incorporate global context modelling in order to reduce unwanted false positive UPI predictions. Our approach, namely UPI-Net, aims to capture long-range dependencies in placenta geometry through lightweight global context modelling and effective multi-scale feature aggregation. We perform a subject-level 10-fold nested cross-validation on a placental ultrasound database (4,871 images with labelled UPI from 49 scans). Experimental results demonstrate that, without introducing considerable computational overhead, UPI-Net yields the highest performance in terms of standard contour detection metrics, compared to other competitive benchmarks.
Tasks Contour Detection
Published 2019-08-31
URL https://arxiv.org/abs/1909.00229v2
PDF https://arxiv.org/pdf/1909.00229v2.pdf
PWC https://paperswithcode.com/paper/upi-net-semantic-contour-detection-in
Repo
Framework
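
A heavily simplified, generic stand-in for the kind of design described above (this is not the UPI-Net architecture): a small convolutional contour detector whose features are modulated by a lightweight global-context branch (global average pooling plus channel gating) before a 1x1 convolution produces the contour map. All layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class TinyContourNet(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Lightweight global context: pool the whole image, gate channels with it.
        self.context = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1), nn.Sigmoid(),
        )
        self.head = nn.Conv2d(channels, 1, 1)    # per-pixel contour logit

    def forward(self, x):
        f = self.features(x)
        return self.head(f * self.context(f))    # broadcast channel gates over space

model = TinyContourNet()
ultrasound = torch.randn(2, 1, 128, 128)          # placeholder grayscale frames
contour_logits = model(ultrasound)                # shape (2, 1, 128, 128)
```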

Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections

Title Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections
Authors Raanan Y. Rohekar, Yaniv Gurwicz, Shami Nisimov, Gal Novik
Abstract Modeling uncertainty in deep neural networks, despite recent important advances, is still an open problem. Bayesian neural networks are a powerful solution, where the prior over network weights is a design choice, often a normal distribution or other distribution encouraging sparsity. However, this prior is agnostic to the generative process of the input data, which might lead to unwarranted generalization for out-of-distribution test data. We suggest the presence of a confounder for the relation between the input data and the discriminative function given the target label. We propose an approach for modeling this confounder by sharing neural connectivity patterns between the generative and discriminative networks. This approach leads to a new deep architecture, where networks are sampled from the posterior of local causal structures, and coupled into a compact hierarchy. We demonstrate that sampling networks from this hierarchy, proportionally to their posterior, is efficient and enables estimating various types of uncertainties. Empirical evaluations of our method demonstrate significant improvement compared to state-of-the-art calibration and out-of-distribution detection methods.
Tasks Calibration, Out-of-Distribution Detection
Published 2019-05-30
URL https://arxiv.org/abs/1905.13195v2
PDF https://arxiv.org/pdf/1905.13195v2.pdf
PWC https://paperswithcode.com/paper/modeling-uncertainty-by-learning-a-hierarchy
Repo
Framework
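
A generic illustration, emphatically not the paper's construction: predictive uncertainty obtained by sampling several networks that differ in their connectivity and treating the spread of their predictions as an uncertainty estimate. Here the sampled "connectivity patterns" are just random binary masks on one weight matrix; the paper instead samples from a learned posterior over local causal structures.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
W1, W2 = torch.randn(64, 10), torch.randn(3, 64)      # a fixed toy two-layer network
x = torch.randn(5, 10)                                 # a batch of test inputs

probs = []
for _ in range(50):                                    # sample 50 connectivity patterns
    mask = (torch.rand_like(W1) > 0.3).float()         # drop ~30% of first-layer weights
    h = F.relu(x @ (W1 * mask).t())
    probs.append(F.softmax(h @ W2.t(), dim=1))
probs = torch.stack(probs)                             # (samples, batch, classes)

mean_pred = probs.mean(dim=0)                          # ensemble prediction
uncertainty = probs.var(dim=0).sum(dim=1)              # spread across sampled networks
print(mean_pred.argmax(dim=1), uncertainty)
```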

Gradient Descent based Optimization Algorithms for Deep Learning Models Training

Title Gradient Descent based Optimization Algorithms for Deep Learning Models Training
Authors Jiawei Zhang
Abstract In this paper, we aim at providing an introduction to the gradient descent based optimization algorithms used for training deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to train. Nowadays, most deep learning model training still relies on the backpropagation algorithm, in which the model variables are updated iteratively until convergence using gradient descent based optimization algorithms. Besides the conventional vanilla gradient descent algorithm, many gradient descent variants have also been proposed in recent years to improve the learning performance, including Momentum, Adagrad, Adam, Gadam, etc., each of which is introduced in this paper.
Tasks
Published 2019-03-11
URL http://arxiv.org/abs/1903.03614v1
PDF http://arxiv.org/pdf/1903.03614v1.pdf
PWC https://paperswithcode.com/paper/gradient-descent-based-optimization
Repo
Framework
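
The update rules for three of the optimizers named above, written out as plain NumPy steps on a toy quadratic so the differences are easy to see. The hyperparameter values are common defaults, not prescriptions from the paper.

```python
import numpy as np

grad = lambda w: 2 * w                       # gradient of f(w) = ||w||^2

def vanilla_gd(w, lr=0.1, steps=100):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def momentum(w, lr=0.1, beta=0.9, steps=100):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad(w)               # accumulate a velocity term
        w = w - lr * v
    return w

def adam(w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=100):
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g            # first-moment estimate
        v = b2 * v + (1 - b2) * g ** 2       # second-moment estimate
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)   # bias correction
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.array([3.0, -2.0])
print(vanilla_gd(w0), momentum(w0), adam(w0))   # all approach the minimum at 0
```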

Multi-view Clustering with the Cooperation of Visible and Hidden Views

Title Multi-view Clustering with the Cooperation of Visible and Hidden Views
Authors Zhaohong Deng, Ruixiu Liu, Te Zhang, Peng Xu, Kup-Sze Choi, Bin Qin, Shitong Wang
Abstract Multi-view data are becoming common in real-world modeling tasks and many multi-view data clustering algorithms have thus been proposed. The existing algorithms usually focus on the cooperation of different views in the original space but neglect the influence of the hidden information among these different visible views, or they only consider the hidden information between the views. These algorithms are therefore not fully effective, since the available information is not fully exploited, particularly the otherness information in different views and the consistency information between them. In practice, the otherness and consistency information in multi-view data are both very useful for effective clustering analyses. In this study, a Multi-View clustering algorithm developed with the Cooperation of Visible and Hidden views, i.e., MV-Co-VH, is proposed. The MV-Co-VH algorithm first projects the multiple views from different visible spaces to the common hidden space by using the non-negative matrix factorization (NMF) strategy to obtain the common hidden view data. Collaborative learning is then implemented in the clustering procedure based on the visible views and the shared hidden view. The results of extensive experiments on UCI multi-view datasets and real-world image multi-view datasets show that the clustering performance of the proposed algorithm is competitive with or even better than that of the existing algorithms.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.04766v1
PDF https://arxiv.org/pdf/1908.04766v1.pdf
PWC https://paperswithcode.com/paper/multi-view-clustering-with-the-cooperation-of
Repo
Framework
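
A much-simplified sketch of the overall recipe (not the MV-Co-VH algorithm itself): the non-negative views are factorized with NMF to obtain a shared hidden representation, and clustering then uses the visible views together with the hidden one. The synthetic data, dimensions, and the simple concatenation used here are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.cluster import KMeans
from sklearn.preprocessing import minmax_scale

rng = np.random.default_rng(0)
n, k = 300, 3
labels_true = rng.integers(0, k, size=n)
# Two synthetic visible views of the same underlying clusters.
view1 = minmax_scale(labels_true[:, None] + rng.normal(scale=0.3, size=(n, 20)))
view2 = minmax_scale(np.cos(labels_true)[:, None] + rng.normal(scale=0.3, size=(n, 15)))

# Shared hidden view: factorize the concatenated views (a simplification of the
# collaborative NMF used in the paper).
hidden = NMF(n_components=5, init="nndsvda", max_iter=500).fit_transform(
    np.hstack([view1, view2])
)

# Cluster using the visible views together with the hidden view.
features = np.hstack([view1, view2, hidden])
pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
print(pred[:20], labels_true[:20])
```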

Distance Assessment and Hypothesis Testing of High-Dimensional Samples using Variational Autoencoders

Title Distance Assessment and Hypothesis Testing of High-Dimensional Samples using Variational Autoencoders
Authors Marco Henrique de Almeida Inácio, Rafael Izbicki, Bálint Gyires-Tóth
Abstract Given two distinct datasets, an important question is whether they have arisen from the same data generating function or, alternatively, how their data generating functions diverge from one another. In this paper, we introduce an approach for measuring the distance between two datasets with high dimensionality using variational autoencoders. This approach is augmented by a permutation hypothesis test in order to check the hypothesis that the data generating distributions are the same within a significance level. We evaluate both the distance measurement and hypothesis testing approaches on generated and on public datasets. According to the results, the proposed approach can be used for data exploration (e.g. by quantifying the discrepancy/separability between categories of images), which can be particularly useful in the early phases of the pipeline of most machine learning projects.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07182v1
PDF https://arxiv.org/pdf/1909.07182v1.pdf
PWC https://paperswithcode.com/paper/distance-assessment-and-hypothesis-testing-of
Repo
Framework
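
A sketch of the permutation-test idea described above. For brevity, a PCA projection stands in for the VAE encoder used in the paper; the distance statistic (difference of embedding means) and the synthetic datasets are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def embed_distance(A, B, encoder):
    """Distance between two datasets in the learned low-dimensional space."""
    return np.linalg.norm(encoder.transform(A).mean(axis=0)
                          - encoder.transform(B).mean(axis=0))

def permutation_test(A, B, encoder, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    observed = embed_distance(A, B, encoder)
    pooled = np.vstack([A, B])
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))           # reshuffle group assignments
        A_p, B_p = pooled[idx[:len(A)]], pooled[idx[len(A):]]
        count += embed_distance(A_p, B_p, encoder) >= observed
    return observed, (count + 1) / (n_perm + 1)      # permutation p-value

rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(200, 50))
B = rng.normal(0.3, 1.0, size=(200, 50))             # shifted generating distribution
encoder = PCA(n_components=5).fit(np.vstack([A, B])) # stand-in for the VAE encoder
dist, p = permutation_test(A, B, encoder)
print(dist, p)                                       # small p-value: distributions differ
```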