January 25, 2020


Paper Group ANR 1751

Weakly Supervised Recognition of Surgical Gestures. Self-supervised representation learning from electroencephalography signals. Evolving Spiking Neural Networks for Nonlinear Control Problems. Is This The Right Place? Geometric-Semantic Pose Verification for Indoor Visual Localization. Actively Learning Gaussian Process Dynamics. Improving Neural …

Weakly Supervised Recognition of Surgical Gestures


Title	Weakly Supervised Recognition of Surgical Gestures
Authors	Beatrice van Amsterdam, Hirenkumar Nakawala, Elena De Momi, Danail Stoyanov
Abstract	Kinematic trajectories recorded from surgical robots contain information about surgical gestures and potentially encode cues about surgeons' skill levels. Automatic segmentation of these trajectories into meaningful action units could help to develop new metrics for surgical skill assessment as well as to simplify surgical automation. State-of-the-art methods for action recognition rely on manual labelling of large datasets, which is time consuming and error prone. Unsupervised methods have been developed to overcome these limitations. However, they often rely on tedious parameter tuning and perform less well than supervised approaches, especially on data with high variability such as surgical trajectories. Hence, weak supervision has the potential to improve unsupervised learning while avoiding manual annotation of large datasets. In this paper, we used at least one expert demonstration and its ground truth annotations to generate an appropriate initialization for a GMM-based algorithm for gesture recognition. We showed on real surgical demonstrations that this initialization significantly outperforms standard task-agnostic initialization methods. We also demonstrated how to improve the recognition accuracy further by redefining the actions and optimising the inputs.
Tasks	Gesture Recognition
Published	2019-07-25
URL	https://arxiv.org/abs/1907.10993v1
PDF	https://arxiv.org/pdf/1907.10993v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-recognition-of-surgical
Repo
Framework
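
The abstract above describes seeding a GMM from one annotated expert demonstration. A minimal sketch of that idea follows, assuming a diagonal covariance and one Gaussian component per gesture label. The function name, the feature tuples and the gesture names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def init_from_demo(samples, labels):
    """Estimate per-gesture weight, mean and variance from one labelled demo.

    samples: list of equal-length feature tuples (hypothetical kinematic features)
    labels:  one gesture label per sample (the ground-truth annotation)
    """
    groups = defaultdict(list)
    for x, g in zip(samples, labels):
        groups[g].append(x)

    params = {}
    for g, xs in groups.items():
        dims = list(zip(*xs))  # transpose: one sequence per feature dimension
        params[g] = {
            "weight": len(xs) / len(samples),
            "mean": [fmean(d) for d in dims],
            # assumption: diagonal covariance; the small floor keeps it positive
            "var": [pvariance(d) + 1e-6 for d in dims],
        }
    return params


# Toy demonstration with two hypothetical gestures.
demo = [(0.0, 0.0), (0.2, 0.0), (1.0, 1.0), (1.2, 1.0)]
annotation = ["reach", "reach", "grasp", "grasp"]
init = init_from_demo(demo, annotation)
```

The resulting weights, means and variances could then be handed to an EM routine as its starting point in place of a task-agnostic initialization such as random or k-means seeding.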

Self-supervised representation learning from electroencephalography signals


Title	Self-supervised representation learning from electroencephalography signals
Authors	Hubert Banville, Isabela Albuquerque, Aapo Hyvärinen, Graeme Moffat, Denis-Alexander Engemann, Alexandre Gramfort
Abstract	The supervised learning paradigm is limited by the cost - and sometimes the impracticality - of data collection and labeling in multiple domains. Self-supervised learning, a paradigm which exploits the structure of unlabeled data to create learning problems that can be solved with standard supervised approaches, has shown great promise as a pretraining or feature learning approach in fields like computer vision and time series processing. In this work, we present self-supervision strategies that can be used to learn informative representations from multivariate time series. One successful approach relies on predicting whether time windows are sampled from the same temporal context or not. As demonstrated on a clinically relevant task (sleep scoring) and with two electroencephalography datasets, our approach outperforms a purely supervised approach in low data regimes, while capturing important physiological information without any access to labels.
Tasks	Representation Learning, Time Series
Published	2019-11-13
URL	https://arxiv.org/abs/1911.05419v1
PDF	https://arxiv.org/pdf/1911.05419v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-representation-learning-from-2
Repo
Framework
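
The pretext task mentioned in the abstract, predicting whether two time windows come from the same temporal context, can be sketched as a pair sampler. The thresholds `tau_pos` and `tau_neg` and the function name are assumptions made for illustration; the abstract does not specify how the context is defined.

```python
import random


def sample_pairs(n_windows, n_pairs, tau_pos, tau_neg, seed=0):
    """Return (i, j, label) triples; label 1 = same temporal context, 0 = not.

    tau_pos and tau_neg are hypothetical thresholds, in window indices.
    """
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n_pairs:
        i, j = rng.randrange(n_windows), rng.randrange(n_windows)
        gap = abs(i - j)
        if i != j and gap <= tau_pos:
            pairs.append((i, j, 1))  # close in time: positive pair
        elif gap > tau_neg:
            pairs.append((i, j, 0))  # far apart in time: negative pair
        # pairs with tau_pos < gap <= tau_neg are ambiguous and skipped
    return pairs


pairs = sample_pairs(n_windows=100, n_pairs=20, tau_pos=2, tau_neg=10)
```

A classifier trained on such pairs needs no sleep-stage labels, which is the sense in which the learned representation is self-supervised.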

Evolving Spiking Neural Networks for Nonlinear Control Problems


Title	Evolving Spiking Neural Networks for Nonlinear Control Problems
Authors	Huanneng Qiu, Matthew Garratt, David Howard, Sreenatha Anavatti
Abstract	Spiking Neural Networks are powerful computational modelling tools that have attracted much interest because of the bioinspired modelling of synaptic interactions between neurons. Most of the research employing spiking neurons has been non-behavioural and discontinuous. In contrast, this paper presents a recurrent spiking controller that is capable of solving nonlinear control problems in continuous domains using a popular topology evolution algorithm as the learning mechanism. We propose two mechanisms necessary for decoding continuous signals from discrete spike transmission: (i) a background current component to maintain frequency sufficiency for spike rate decoding, and (ii) a general network structure that derives strength from topology evolution. We demonstrate that the proposed spiking controller discovers functional solutions significantly faster than sigmoidal neural networks when solving a classic nonlinear control problem.
Tasks
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01180v1
PDF	http://arxiv.org/pdf/1903.01180v1.pdf
PWC	https://paperswithcode.com/paper/evolving-spiking-neural-networks-for
Repo
Framework
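
To illustrate why a background current helps spike rate decoding, here is a minimal sketch using a generic discrete-time leaky integrate-and-fire neuron. The neuron model, the leak factor and the threshold are assumptions chosen for illustration and are not the paper's controller.

```python
def spike_rate(input_current, background, steps=100, threshold=1.0, leak=0.5):
    """Simulate a leaky integrate-and-fire neuron; return spikes per step."""
    v, spikes = 0.0, 0
    for _ in range(steps):
        # membrane potential decays, then integrates input plus background
        v = leak * v + input_current + background
        if v >= threshold:
            spikes += 1
            v = 0.0  # reset after a spike
    return spikes / steps


silent = spike_rate(0.25, background=0.0)  # potential saturates below threshold
active = spike_rate(0.25, background=0.5)  # same input now yields a usable rate
```

Without the background term a weak input never reaches the threshold, so the output rate carries no information about it. With the background term the rate varies with the input and can be decoded as a continuous signal.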

Is This The Right Place? Geometric-Semantic Pose Verification for Indoor Visual Localization


Title	Is This The Right Place? Geometric-Semantic Pose Verification for Indoor Visual Localization
Authors	Hajime Taira, Ignacio Rocco, Jiri Sedlar, Masatoshi Okutomi, Josef Sivic, Tomas Pajdla, Torsten Sattler, Akihiko Torii
Abstract	Visual localization in large and complex indoor scenes, dominated by weakly textured rooms and repeating geometric patterns, is a challenging problem with high practical relevance for applications such as Augmented Reality and robotics. To handle the ambiguities arising in this scenario, a common strategy is, first, to generate multiple estimates for the camera pose from which a given query image was taken. The pose with the largest geometric consistency with the query image, e.g., in the form of an inlier count, is then selected in a second stage. While a significant amount of research has concentrated on the first stage, there is considerably less work on the second stage. In this paper, we thus focus on pose verification. We show that combining different modalities, namely appearance, geometry, and semantics, considerably boosts pose verification and consequently pose accuracy. We develop multiple hand-crafted approaches as well as a trainable approach to combine these modalities for geometric-semantic verification and show significant improvements over the state of the art on a very challenging indoor dataset.
Tasks	Visual Localization
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04598v2
PDF	https://arxiv.org/pdf/1908.04598v2.pdf
PWC	https://paperswithcode.com/paper/is-this-the-right-place-geometric-semantic
Repo
Framework
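
The two-stage strategy in the abstract ends with a verification step that scores each candidate pose and keeps the best one. The sketch below contrasts a geometry-only score with a weighted combination of modalities. The score values, the weights and the function name are hypothetical, and the weighted sum is a simple stand-in rather than the paper's hand-crafted or trainable verifiers.

```python
def select_pose(candidates, weights=None):
    """Pick the candidate with the highest (optionally weighted) score.

    candidates: list of dicts with a 'pose' id and per-modality scores in [0, 1]
    weights:    hypothetical per-modality weights; None means geometry only
    """
    weights = weights or {"geometry": 1.0}

    def score(c):
        return sum(w * c[m] for m, w in weights.items())

    return max(candidates, key=score)["pose"]


candidates = [
    {"pose": "A", "geometry": 0.80, "appearance": 0.20, "semantics": 0.30},
    {"pose": "B", "geometry": 0.75, "appearance": 0.90, "semantics": 0.85},
]
geometry_only = select_pose(candidates)
combined = select_pose(
    candidates, {"geometry": 1.0, "appearance": 1.0, "semantics": 1.0}
)
```

In this toy case the geometry-only rule prefers pose A by a small margin, while the combined score prefers pose B, which is the kind of ambiguity that additional modalities are meant to resolve.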

Actively Learning Gaussian Process Dynamics


Title	Actively Learning Gaussian Process Dynamics
Authors	Mona Buisson-Fenet, Friedrich Solowjow, Sebastian Trimpe
Abstract	Despite the availability of ever more data enabled through modern sensor and computer technology, it still remains an open problem to learn dynamical systems in a sample-efficient way. We propose active learning strategies that leverage information-theoretical properties arising naturally during Gaussian process regression, while respecting constraints on the sampling process imposed by the system dynamics. Sample points are selected in regions with high uncertainty, leading to exploratory behavior and data-efficient training of the model. All results are finally verified in an extensive numerical benchmark.
Tasks	Active Learning
Published	2019-11-22
URL	https://arxiv.org/abs/1911.09946v1
PDF	https://arxiv.org/pdf/1911.09946v1.pdf
PWC	https://paperswithcode.com/paper/actively-learning-gaussian-process-dynamics
Repo
Framework
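
The selection rule in the abstract, sampling where the model is most uncertain, can be written with the standard Gaussian process posterior variance. In this sketch $K$ is the kernel matrix over the observed inputs, $k_*$ holds the kernel values between a candidate $x_*$ and those inputs, and $\sigma_n^2$ is the noise variance. The set $\mathcal{X}_{\text{reach}}$ is a hypothetical symbol for the points the system dynamics allow to be visited; the paper's exact criteria may differ.

```latex
\sigma^2(x_*) = k(x_*, x_*) - k_*^\top \left(K + \sigma_n^2 I\right)^{-1} k_*,
\qquad
x_{\text{next}} = \arg\max_{x \in \mathcal{X}_{\text{reach}}} \sigma^2(x)
```

Because the differential entropy of a Gaussian prediction, $H = \tfrac{1}{2}\log\left(2\pi e\,\sigma^2\right)$, increases monotonically with $\sigma^2$, choosing the point of maximum variance is equivalent to choosing the point of maximum predictive entropy.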

Improving Neural Relation Extraction with Implicit Mutual Relations


Title	Improving Neural Relation Extraction with Implicit Mutual Relations
Authors	Jun Kuang, Yixin Cao, Jianbing Zheng, Xiangnan He, Ming Gao, Aoying Zhou
Abstract	Relation extraction (RE) aims at extracting the relation between two entities from the text corpora. It is a crucial task for Knowledge Graph (KG) construction. Most existing methods predict the relation between an entity pair by learning the relation from the training sentences, which contain the targeted entity pair. In contrast to existing distant supervision approaches that suffer from insufficient training corpora to extract relations, our proposal of mining implicit mutual relations from the massive unlabeled corpora transfers the semantic information of entity pairs into the RE model, which is more expressive and semantically plausible. After constructing an entity proximity graph based on the implicit mutual relations, we preserve the semantic relations of entity pairs via embedding each vertex of the graph into a low-dimensional space. As a result, we can easily and flexibly integrate the implicit mutual relations and other entity information, such as entity types, into the existing RE methods. Our experimental results on the New York Times and Google Distant Supervision datasets suggest that our proposed neural RE framework provides a promising improvement for the RE task, and significantly outperforms the state-of-the-art methods. Moreover, the component for mining implicit mutual relations is flexible enough to significantly improve the performance of both CNN-based and RNN-based RE models.
Tasks	Relation Extraction
Published	2019-07-08
URL	https://arxiv.org/abs/1907.05333v1
PDF	https://arxiv.org/pdf/1907.05333v1.pdf
PWC	https://paperswithcode.com/paper/improving-neural-relation-extraction-with
Repo
Framework

Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review


Title	Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review
Authors	Seyedmostafa Sheikhalishahi, Riccardo Miotto, Joel T Dudley, Alberto Lavelli, Fabio Rinaldi, Venet Osmani
Abstract	Of the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using ICD-10. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes. Further efforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.
Tasks	Word Embeddings
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05780v1
PDF	https://arxiv.org/pdf/1908.05780v1.pdf
PWC	https://paperswithcode.com/paper/natural-language-processing-of-clinical-notes
Repo
Framework

On Explaining Machine Learning Models by Evolving Crucial and Compact Features


Title	On Explaining Machine Learning Models by Evolving Crucial and Compact Features
Authors	Marco Virgolin, Tanja Alderliesten, Peter A. N. Bosman
Abstract	Feature construction can substantially improve the accuracy of Machine Learning (ML) algorithms. Genetic Programming (GP) has been proven to be effective at this task by evolving non-linear combinations of input features. GP additionally has the potential to improve ML explainability since explicit expressions are evolved. Yet, in most GP works the complexity of evolved features is not explicitly bound or minimized though this is arguably key for explainability. In this article, we assess to what extent GP still performs favorably at feature construction when constructing features that are (1) Of small-enough number, to enable visualization of the behavior of the ML model; (2) Of small-enough size, to enable interpretability of the features themselves; (3) Of sufficient informative power, to retain or even improve the performance of the ML algorithm. We consider a simple feature construction scheme using three different GP algorithms, as well as random search, to evolve features for five ML algorithms, including support vector machines and random forest. Our results on 21 datasets pertaining to classification and regression problems show that constructing only two compact features can be sufficient to rival the use of the entire original feature set. We further find that a modern GP algorithm, GP-GOMEA, performs best overall. These results, combined with examples that we provide of readable constructed features and of 2D visualizations of ML behavior, lead us to positively conclude that GP-based feature construction still works well when explicitly searching for compact features, making it extremely helpful to explain ML models.
Tasks
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02260v3
PDF	https://arxiv.org/pdf/1907.02260v3.pdf
PWC	https://paperswithcode.com/paper/on-explaining-machine-learning-models-by
Repo
Framework

RankSRGAN: Generative Adversarial Networks with Ranker for Image Super-Resolution


Title	RankSRGAN: Generative Adversarial Networks with Ranker for Image Super-Resolution
Authors	Wenlong Zhang, Yihao Liu, Chao Dong, Yu Qiao
Abstract	Generative Adversarial Networks (GAN) have demonstrated the potential to recover realistic details for single image super-resolution (SISR). To further improve the visual quality of super-resolved results, the PIRM2018-SR Challenge employed perceptual metrics, such as PI, NIQE, and Ma, to assess the perceptual quality. However, existing methods cannot directly optimize these non-differentiable perceptual metrics, which are shown to be highly correlated with human ratings. To address the problem, we propose Super-Resolution Generative Adversarial Networks with Ranker (RankSRGAN) to optimize the generator in the direction of perceptual metrics. Specifically, we first train a Ranker which can learn the behavior of perceptual metrics and then introduce a novel rank-content loss to optimize the perceptual quality. The most appealing part is that the proposed method can combine the strengths of different SR methods to generate better results. Extensive experiments show that RankSRGAN achieves visually pleasing results and reaches state-of-the-art performance in perceptual metrics. Project page: https://wenlongzhang0724.github.io/Projects/RankSRGAN
Tasks	Image Super-Resolution, Super-Resolution
Published	2019-08-18
URL	https://arxiv.org/abs/1908.06382v2
PDF	https://arxiv.org/pdf/1908.06382v2.pdf
PWC	https://paperswithcode.com/paper/ranksrgan-generative-adversarial-networks
Repo
Framework
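
One common way to train a ranker on a non-differentiable metric is a pairwise margin ranking loss, sketched below. This is an assumption about how such a Ranker could be trained; the abstract does not state the loss. The function name and the margin value are hypothetical.

```python
def margin_ranking_loss(score_a, score_b, a_is_better, margin=0.5):
    """Hinge loss that is zero once the better image leads by `margin`.

    score_a, score_b: ranker outputs for two super-resolved images
    a_is_better:      ordering given by the perceptual metric (the supervision)
    """
    sign = 1.0 if a_is_better else -1.0
    return max(0.0, margin - sign * (score_a - score_b))


correct_order = margin_ranking_loss(2.0, 1.0, a_is_better=True)
wrong_order = margin_ranking_loss(1.0, 2.0, a_is_better=True)
```

Only the ordering of the metric values is used as supervision, so the metric itself never has to be differentiated. The trained ranker is differentiable and can then supply gradients to the generator.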

Place recognition in gardens by learning visual representations: data set and benchmark analysis


Title	Place recognition in gardens by learning visual representations: data set and benchmark analysis
Authors	Maria Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov
Abstract	Visual place recognition is an important component of systems for camera localization and loop closure detection. It concerns the recognition of a previously visited place based on visual cues only. Although it is a widely studied problem for indoor and urban environments, the recent use of robots for automation of agricultural and gardening tasks has created new problems, due to the challenging appearance of garden-like environments. Garden scenes predominantly contain green colors, as well as repetitive patterns and textures. The lack of available data recorded in gardens and natural environments makes the improvement of visual localization algorithms difficult. In this paper we propose an extended version of the TB-Places data set, which is designed for testing algorithms for visual place recognition. It contains images with ground truth camera pose recorded in real gardens in different seasons, with varying light conditions. We constructed and released a ground truth for all possible pairs of images, indicating whether they depict the same place or not. We present the results of a benchmark analysis of methods based on convolutional neural networks for holistic image description and place recognition. We train existing networks (i.e. ResNet, DenseNet and VGG NetVLAD) as the backbone of a two-way architecture with a contrastive loss function. The results that we obtained demonstrate that learning garden-tailored representations contributes to an improvement of performance, although the generalization capabilities are limited.
Tasks	Camera Localization, Loop Closure Detection, Visual Localization, Visual Place Recognition
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12151v1
PDF	https://arxiv.org/pdf/1906.12151v1.pdf
PWC	https://paperswithcode.com/paper/place-recognition-in-gardens-by-learning
Repo
Framework
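
The contrastive loss used with the two-way architecture can be sketched for a single image pair as follows. This is the commonly used form of the loss, assumed here because the abstract does not give the exact variant. The descriptors, the margin and the function name are hypothetical.

```python
import math


def contrastive_loss(desc_a, desc_b, same_place, margin=1.0):
    """Contrastive loss for one pair of image descriptors.

    Pairs of the same place are pulled together; different places are
    pushed apart until their distance exceeds `margin`.
    """
    d = math.dist(desc_a, desc_b)  # Euclidean distance between descriptors
    if same_place:
        return d ** 2
    return max(0.0, margin - d) ** 2


pull = contrastive_loss((0.0, 0.0), (0.5, 0.0), same_place=True)
push = contrastive_loss((0.0, 0.0), (0.5, 0.0), same_place=False)
```

The pairwise ground truth described in the abstract, which says for every image pair whether both images show the same place, supplies the `same_place` flag.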

Variational Autoencoded Regression: High Dimensional Regression of Visual Data on Complex Manifold


Title	Variational Autoencoded Regression: High Dimensional Regression of Visual Data on Complex Manifold
Authors	YoungJoon Yoo, Sangdoo Yun, Hyung Jin Chang, Yiannis Demiris, Jin Young Choi
Abstract	This paper proposes a new high dimensional regression method by merging Gaussian process regression into a variational autoencoder framework. In contrast to other regression methods, the proposed method focuses on the case where output responses are on a complex high dimensional manifold, such as images. Our contributions are summarized as follows: (i) A new regression method estimating high dimensional image responses, which is not handled by existing regression algorithms, is proposed. (ii) The proposed regression method introduces a strategy to learn the latent space as well as the encoder and decoder so that the result of the regressed response in the latent space coincides with the corresponding response in the data space. (iii) The proposed regression is embedded into a generative model, and the whole procedure is developed by the variational autoencoder framework. We demonstrate the robustness and effectiveness of our method through a number of experiments on various visual data regression problems.
Tasks
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04015v1
PDF	https://arxiv.org/pdf/1908.04015v1.pdf
PWC	https://paperswithcode.com/paper/variational-autoencoded-regression-high-1
Repo
Framework

Learning Hierarchical Feature Space Using CLAss-specific Subspace Multiple Kernel – Metric Learning for Classification


Title	Learning Hierarchical Feature Space Using CLAss-specific Subspace Multiple Kernel – Metric Learning for Classification
Authors	Yinan Yu, Tomas McKelvey
Abstract	Metric learning for classification has been intensively studied over the last decade. The idea is to learn a metric space induced from a normed vector space on which data from different classes are well separated. Different measures of the separation thus lead to various designs of the objective function in the metric learning model. One classical metric is the Mahalanobis distance, where a linear transformation matrix is designed and applied on the original dataset to obtain a new subspace equipped with the Euclidean norm. The kernelized version has also been developed, followed by Multiple-Kernel learning models. In this paper, we consider metric learning to be the identification of the best kernel function with respect to a high class separability in the corresponding metric space. The contribution is twofold: 1) No pairwise computations are required as in most metric learning techniques; 2) Better flexibility and lower computational complexity are achieved using the CLAss-Specific (Multiple) Kernel - Metric Learning (CLAS(M)K-ML). The proposed techniques can be considered as a preprocessing step to any kernel method or kernel approximation technique. An extension to a hierarchical learning structure is also proposed to further improve the classification performance, where on each layer, the CLASMK is computed based on a selected “marginal” subset and feature vectors are constructed by concatenating the features from all previous layers.
Tasks	Metric Learning
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09309v1
PDF	https://arxiv.org/pdf/1910.09309v1.pdf
PWC	https://paperswithcode.com/paper/learning-hierarchical-feature-space-using
Repo
Framework
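
The Mahalanobis distance mentioned in the abstract can be written explicitly. The symbols $L$ (the linear transformation) and $M$ are notation introduced here for illustration.

```latex
d_M(x, y) = \sqrt{(x - y)^\top M (x - y)} = \lVert Lx - Ly \rVert_2,
\qquad M = L^\top L \succeq 0
```

Learning $M$ is therefore the same as learning the map $L$ and measuring ordinary Euclidean distance in the transformed space.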

Scene Motion Decomposition for Learnable Visual Odometry


Title	Scene Motion Decomposition for Learnable Visual Odometry
Authors	Igor Slinko, Anna Vorontsova, Filipp Konokhov, Olga Barinova, Anton Konushin
Abstract	Optical Flow (OF) and depth are commonly used for visual odometry since they provide sufficient information about camera ego-motion in a rigid scene. We reformulate the problem of ego-motion estimation as a problem of motion estimation of a 3D-scene with respect to a static camera. The entire scene motion can be represented as a combination of motions of its visible points. Using OF and depth we estimate the motion of each point in terms of 6DoF and represent results in the form of motion maps, each one addressing a single degree of freedom. In this work we provide motion maps as inputs to a deep neural network that predicts 6DoF of scene motion. Through our evaluation on outdoor and indoor datasets we show that utilizing motion maps leads to accuracy improvement in comparison with naive stacking of depth and OF. Another contribution of our work is a novel network architecture that efficiently exploits motion maps and outperforms learnable RGB/RGB-D baselines.
Tasks	Motion Estimation, Optical Flow Estimation, Visual Odometry
Published	2019-07-16
URL	https://arxiv.org/abs/1907.07227v1
PDF	https://arxiv.org/pdf/1907.07227v1.pdf
PWC	https://paperswithcode.com/paper/scene-motion-decomposition-for-learnable
Repo
Framework

New optimization algorithms for neural network training using operator splitting techniques


Title	New optimization algorithms for neural network training using operator splitting techniques
Authors	Cristian Daniel Alecsa, Titus Pinta, Imre Boros
Abstract	In this paper we present a new type of optimization algorithm adapted for neural network training. These algorithms are based upon a sequential operator splitting technique for some associated dynamical systems. Furthermore, we investigate through numerical simulations the empirical rate of convergence of these iterative schemes toward a local minimum of the loss function, with some suitable choices of the underlying hyper-parameters. We validate the convergence of these optimizers using the results of the accuracy and of the loss function on the MNIST, MNIST-Fashion and CIFAR 10 classification datasets.
Tasks
Published	2019-04-29
URL	https://arxiv.org/abs/1904.12952v5
PDF	https://arxiv.org/pdf/1904.12952v5.pdf
PWC	https://paperswithcode.com/paper/new-optimization-algorithms-for-neural
Repo
Framework

The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions


Title	The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions
Authors	Raj Agrawal, Jonathan H. Huggins, Brian Trippe, Tamara Broderick
Abstract	Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a particular Gaussian process (GP) representation; the GP allows us to capture the posterior with a vector of O(p) kernel hyper-parameters rather than O(p^2) interactions and main effects. With the implicit representation, we can run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.
Tasks
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06501v2
PDF	https://arxiv.org/pdf/1905.06501v2.pdf
PWC	https://paperswithcode.com/paper/the-kernel-interaction-trick-fast-bayesian
Repo
Framework
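
The abstract's central point is that pairwise interactions can be represented implicitly through a kernel rather than enumerated. The sketch below shows the generic kernel-trick version of that idea with a plain quadratic kernel. It is not the paper's hierarchical model or its specific kernel, and the function names are hypothetical.

```python
from itertools import product


def explicit_pairwise_features(x):
    """All p*p products x_i * x_j: quadratic cost in the dimension p."""
    return [xi * xj for xi, xj in product(x, repeat=2)]


def quadratic_kernel(x, y):
    """Same inner product computed implicitly in time linear in p."""
    return sum(a * b for a, b in zip(x, y)) ** 2


x, y = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
explicit = sum(
    a * b
    for a, b in zip(explicit_pairwise_features(x), explicit_pairwise_features(y))
)
implicit = quadratic_kernel(x, y)
```

Both quantities agree, but the implicit form never materializes the p^2 interaction terms, which is the source of the linear per-iteration cost the abstract refers to.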