Paper Group ANR 1234
Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm
Title | Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm |
Authors | Stefano Spigler, Mario Geiger, Matthieu Wyart |
Abstract | How many training data are needed to learn a supervised task? It is often observed that the generalization error decreases as $n^{-\beta}$ where $n$ is the number of training examples and $\beta$ an exponent that depends on both data and algorithm. In this work we measure $\beta$ when applying kernel methods to real datasets. For MNIST we find $\beta\approx 0.4$ and for CIFAR10 $\beta\approx 0.1$. Remarkably, $\beta$ is the same for regression and classification tasks, and for Gaussian or Laplace kernels. To rationalize the existence of non-trivial exponents that can be independent of the specific kernel used, we introduce the Teacher-Student framework for kernels. In this scheme, a Teacher generates data according to a Gaussian random field, and a Student learns them via kernel regression. With a simplifying assumption — namely that the data are sampled from a regular lattice — we derive analytically $\beta$ for translation invariant kernels, using previous results from the kriging literature. Provided that the Student is not too sensitive to high frequencies, $\beta$ depends only on the training data and their dimension. We confirm numerically that these predictions hold when the training points are sampled at random on a hypersphere. Overall, our results quantify how smooth Gaussian data should be to avoid the curse of dimensionality, and indicate that for kernel learning the relevant dimension of the data should be defined in terms of how the distance between nearest data points depends on $n$. With this definition one obtains reasonable effective smoothness estimates for MNIST and CIFAR10. |
Tasks | |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10843v5 |
PDF | https://arxiv.org/pdf/1905.10843v5.pdf |
PWC | https://paperswithcode.com/paper/asymptotic-learning-curves-of-kernel-methods |
Repo | |
Framework | |
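Below is a minimal sketch of the kind of learning-curve measurement this paper performs: fit kernel regression on growing training sets, then read the exponent $\beta$ off the log-log slope of test error versus $n$. The synthetic data stand in for MNIST/CIFAR10 vectors, and the regularisation value is an illustrative choice.

```python
# Sketch: estimate the learning-curve exponent beta, where test error ~ n^(-beta).
# Uses scikit-learn's KernelRidge with the Laplace kernel, as in the paper's setup.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
d = 20
X = rng.standard_normal((6000, d))                       # stand-in for real data vectors
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(6000))   # stand-in regression targets

X_test, y_test = X[5000:], y[5000:]
sizes = [250, 500, 1000, 2000, 4000]
errors = []
for n in sizes:
    model = KernelRidge(alpha=1e-8, kernel='laplacian')  # Laplace kernel regression
    model.fit(X[:n], y[:n])
    errors.append(np.mean((model.predict(X_test) - y_test) ** 2))

# beta is (minus) the slope of the log-log learning curve.
beta = -np.polyfit(np.log(sizes), np.log(errors), 1)[0]
print(f"estimated beta ~ {beta:.2f}")
```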
Eyenet: Attention based Convolutional Encoder-Decoder Network for Eye Region Segmentation
Title | Eyenet: Attention based Convolutional Encoder-Decoder Network for Eye Region Segmentation |
Authors | Priya Kansal, Sabari Nathan |
Abstract | With the rapid development of augmented and virtual reality, accurate and fast eye tracking is required. Facebook Research has organized the OpenEDS Semantic Segmentation challenge for per-pixel segmentation of the key eye regions: the sclera, the iris, the pupil, and everything else (background). Two constraints are set for the participants: mIoU and the computational complexity of the model. Recently, researchers have achieved quite good results using convolutional neural networks (CNNs) to segment eye regions. However, the environmental challenges involved in this task, such as low resolution, blur, unusual glint, illumination, off-angle and off-axis views, use of glasses, and varying iris colour, hinder segmentation accuracy. To address these challenges, the present work proposes a robust and computationally efficient attention-based convolutional encoder-decoder network for segmenting all the eye regions. Our model, named EyeNet, includes modified residual units as the backbone, two types of attention blocks, and multi-scale supervision for segmenting the aforesaid four eye regions. Our proposed model achieved a total score of 0.974 (EDS evaluation metric) on the test data, demonstrating superior results compared to the baseline methods. |
Tasks | Eye Tracking, Semantic Segmentation |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03274v1 |
PDF | https://arxiv.org/pdf/1910.03274v1.pdf |
PWC | https://paperswithcode.com/paper/eyenet-attention-based-convolutional-encoder |
Repo | |
Framework | |
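EyeNet's exact attention blocks are not detailed in the abstract, so the following is a generic squeeze-and-excitation-style channel-attention block in PyTorch, illustrating how such blocks re-weight encoder features; all dimensions are illustrative.

```python
# A generic channel-attention block of the kind used in attention-based
# encoder-decoders (a sketch, not EyeNet's actual block).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze: global spatial average
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)                            # re-weight feature channels

feats = torch.randn(2, 64, 80, 120)                        # e.g. encoder features of eye images
print(ChannelAttention(64)(feats).shape)                   # torch.Size([2, 64, 80, 120])
```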
Efficient ConvNet-based Object Detection for Unmanned Aerial Vehicles by Selective Tile Processing
Title | Efficient ConvNet-based Object Detection for Unmanned Aerial Vehicles by Selective Tile Processing |
Authors | George Plastiras, Christos Kyrkou, Theocharis Theocharides |
Abstract | Many applications utilizing Unmanned Aerial Vehicles (UAVs) require computer vision algorithms to analyze the information captured by their on-board camera. Recent advances in deep learning have made it possible to use single-shot Convolutional Neural Network (CNN) detection algorithms that process the input image to detect various objects of interest. To keep computational demands low, these networks typically operate on small image sizes, which, however, makes it difficult to detect small objects. This is further emphasized for camera-equipped UAVs, where, due to the viewing range, objects tend to appear relatively small. This paper therefore explores the trade-offs involved in maintaining the resolution of the objects of interest by extracting smaller patches (tiles) from the larger input image and processing them with a neural network. Specifically, we introduce an attention mechanism that focuses detection on only some of the tiles and a memory mechanism that keeps track of information for tiles that are not processed. Through the analysis of different methods and experiments, we show that by carefully selecting which tiles to process we can considerably improve detection accuracy while maintaining performance comparable to CNNs that resize and process a single image, which makes the proposed approach suitable for UAV applications. |
Tasks | Object Detection |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06073v1 |
PDF | https://arxiv.org/pdf/1911.06073v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-convnet-based-object-detection-for |
Repo | |
Framework | |
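A minimal sketch of selective tile processing: split each frame into tiles, run the detector only on the tiles an attention policy selects, and fall back to remembered detections elsewhere. The detector and the scoring policy below are placeholders, not the paper's exact mechanisms.

```python
import numpy as np

def split_tiles(img, th, tw):
    H, W = img.shape[:2]
    return {(r, c): img[r:r+th, c:c+tw]
            for r in range(0, H, th) for c in range(0, W, tw)}

def process_frame(img, detect, memory, scores, top_k=2):
    """Run `detect` only on the top_k most promising tiles; reuse
    remembered detections elsewhere and raise skipped tiles' priority."""
    tiles = split_tiles(img, 256, 256)
    selected = sorted(tiles, key=lambda k: scores.get(k, 1.0), reverse=True)[:top_k]
    results = {}
    for key, tile in tiles.items():
        if key in selected:
            results[key] = detect(tile)                  # fresh detection
            memory[key] = results[key]
            scores[key] = 1.0 + len(results[key])        # active tiles: revisit sooner
        else:
            results[key] = memory.get(key, [])           # memory: last known detections
            scores[key] = scores.get(key, 1.0) * 1.5     # unattended tiles gain priority
    return results

fake_detect = lambda tile: []                            # stand-in for a single-shot CNN
frame = np.zeros((512, 512, 3), dtype=np.uint8)
memory, scores = {}, {}
dets = process_frame(frame, fake_detect, memory, scores)
```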
Learning Interactive Behaviors for Musculoskeletal Robots Using Bayesian Interaction Primitives
Title | Learning Interactive Behaviors for Musculoskeletal Robots Using Bayesian Interaction Primitives |
Authors | Joseph Campbell, Arne Hitzmann, Simon Stepputtis, Shuhei Ikemoto, Koh Hosoda, Heni Ben Amor |
Abstract | Musculoskeletal robots that are based on pneumatic actuation have a variety of properties, such as compliance and back-drivability, that render them particularly appealing for human-robot collaboration. However, programming interactive and responsive behaviors for such systems is extremely challenging due to the nonlinearity and uncertainty inherent to their control. In this paper, we propose an approach for learning Bayesian Interaction Primitives for musculoskeletal robots given a limited set of example demonstrations. We show that this approach is capable of real-time state estimation and response generation for interaction with a robot for which no analytical model exists. Human-robot interaction experiments on a ‘handshake’ task show that the approach generalizes to new positions, interaction partners, and movement velocities. |
Tasks | |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05552v1 |
PDF | https://arxiv.org/pdf/1908.05552v1.pdf |
PWC | https://paperswithcode.com/paper/learning-interactive-behaviors-for |
Repo | |
Framework | |
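To make the interaction-primitive idea concrete, here is a simplified sketch: model joint basis weights of the human's and robot's trajectories as a Gaussian learned from demonstrations, then condition on partially observed human motion to infer the robot's response. This is a ProMP/BIP-style conditioning under synthetic demonstrations, not the paper's full Bayesian Interaction Primitives filter.

```python
import numpy as np

rng = np.random.default_rng(0)
T, B = 50, 8
t = np.linspace(0, 1, T)
Phi = np.exp(-((t[:, None] - np.linspace(0, 1, B)[None, :]) ** 2) / 0.02)  # RBF basis

# Fit basis weights to each demonstration (human dim and robot dim stacked).
demos = [np.c_[np.sin(2*np.pi*t + e), np.cos(2*np.pi*t + e)]
         for e in 0.3 * rng.standard_normal(20)]
W = np.array([np.linalg.lstsq(Phi, d, rcond=None)[0].ravel(order='F') for d in demos])
mu, Sigma = W.mean(0), np.cov(W.T) + 1e-6 * np.eye(2 * B)

# Condition on the first 10 observed human samples (Kalman-style update).
H = np.zeros((10, 2 * B)); H[:, :B] = Phi[:10]           # observe the human block only
y = demos[0][:10, 0]
S = H @ Sigma @ H.T + 1e-4 * np.eye(10)
K = Sigma @ H.T @ np.linalg.solve(S, np.eye(10))
mu_post = mu + K @ (y - H @ mu)
robot_traj = Phi @ mu_post[B:]                           # inferred robot response
```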
Transfer Learning from Partial Annotations for Whole Brain Segmentation
Title | Transfer Learning from Partial Annotations for Whole Brain Segmentation |
Authors | Chengliang Dai, Yuanhan Mo, Elsa Angelini, Yike Guo, Wenjia Bai |
Abstract | Brain MR image segmentation is a key task in neuroimaging studies. It is commonly conducted using standard computational tools, such as FSL, SPM, and multi-atlas segmentation, which are often registration-based and computationally expensive. Recently, there has been increased interest in using deep neural networks for brain image segmentation, which have demonstrated advantages in both speed and performance. However, neural network-based approaches normally require a large amount of manual annotation to optimise the massive number of network parameters. For 3D networks used in volumetric image segmentation, this has become a particular challenge, as a 3D network has many more parameters than its 2D counterpart. Manual annotation of 3D brain images is extremely time-consuming and requires extensive involvement of trained experts. To address the challenge of limited manual annotation, here we propose a novel multi-task learning framework for brain image segmentation, which utilises a large amount of automatically generated partial annotations together with a small set of manually created full annotations for network training. Our method yields performance comparable to state-of-the-art methods for whole brain segmentation. |
Tasks | Brain Image Segmentation, Brain Segmentation, Multi-Task Learning, Semantic Segmentation, Transfer Learning |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10851v1 |
PDF | https://arxiv.org/pdf/1908.10851v1.pdf |
PWC | https://paperswithcode.com/paper/transfer-learning-from-partial-annotations |
Repo | |
Framework | |
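A minimal sketch of how training can mix full and partial annotations: compute the per-voxel loss everywhere, but average it only where an annotation exists. The masking scheme is an illustrative assumption, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def segmentation_loss(logits, labels, labelled_mask):
    """Cross-entropy computed only where labels exist.

    logits: (N, C, D, H, W) network output
    labels: (N, D, H, W) class indices, arbitrary where unlabelled
    labelled_mask: (N, D, H, W) bool, True where the annotation is valid
                   (everywhere for full annotations, sparse for partial ones)
    """
    loss = F.cross_entropy(logits, labels, reduction='none')
    return (loss * labelled_mask).sum() / labelled_mask.sum().clamp(min=1)

logits = torch.randn(2, 4, 8, 32, 32, requires_grad=True)
labels = torch.randint(0, 4, (2, 8, 32, 32))
mask = torch.rand(2, 8, 32, 32) > 0.5                    # partial annotation coverage
segmentation_loss(logits, labels, mask).backward()
```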
Detecting Alzheimer’s Disease by estimating attention and elicitation path through the alignment of spoken picture descriptions with the picture prompt
Title | Detecting Alzheimer’s Disease by estimating attention and elicitation path through the alignment of spoken picture descriptions with the picture prompt |
Authors | Bahman Mirheidari, Yilin Pan, Traci Walker, Markus Reuber, Annalena Venneri, Daniel Blackburn, Heidi Christensen |
Abstract | Cognitive decline is a sign of Alzheimer’s disease (AD), and there is evidence that tracking a person’s eye movements with eye tracking devices can be used for the automatic identification of early signs of cognitive decline. However, such devices are expensive and may not be easy to use for people with cognitive problems. In this paper, we present a new way of capturing similar visual features by using the speech of people describing the Cookie Theft picture - a common cognitive testing task - to identify regions in the picture prompt that will have caught the speaker’s attention and elicited their speech. After aligning the automatically recognised words with different regions of the picture prompt, we extract information inspired by eye tracking metrics, such as the coordinates of areas of interest (AOIs), time spent in an AOI, time to reach an AOI, and the number of AOI visits. Using the DementiaBank dataset, we train a binary classifier (AD vs. healthy control) using 10-fold cross-validation and achieve an 80% F1-score using the timing information from forced alignment; this drops to around 72% when using the timing information from the automatic speech recogniser (ASR) outputs. |
Tasks | Eye Tracking |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00515v1 |
PDF | https://arxiv.org/pdf/1910.00515v1.pdf |
PWC | https://paperswithcode.com/paper/detecting-alzheimers-disease-by-estimating |
Repo | |
Framework | |
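A minimal sketch of turning time-aligned words into eye-tracking-style AOI features: each recognised word is mapped to a picture region, and time in AOI, time to first reach it, and visit counts are accumulated. The keyword-to-region map is an illustrative assumption.

```python
from collections import defaultdict

AOI_KEYWORDS = {                      # hypothetical Cookie Theft regions
    'boy': 'stool', 'stool': 'stool', 'cookie': 'jar', 'jar': 'jar',
    'mother': 'sink', 'sink': 'sink', 'water': 'sink',
}

def aoi_features(aligned_words):
    """aligned_words: list of (word, start_sec, end_sec) from forced alignment."""
    time_in = defaultdict(float)
    first_reach = {}
    visits = defaultdict(int)
    prev = None
    for word, start, end in aligned_words:
        aoi = AOI_KEYWORDS.get(word.lower())
        if aoi is None:
            continue
        time_in[aoi] += end - start              # time spent in the AOI
        first_reach.setdefault(aoi, start)       # time to reach the AOI
        if aoi != prev:
            visits[aoi] += 1                     # a new 'visit' when attention shifts
        prev = aoi
    return dict(time_in), first_reach, dict(visits)

words = [('the', 0.0, 0.2), ('boy', 0.2, 0.6), ('takes', 0.6, 1.0),
         ('a', 1.0, 1.1), ('cookie', 1.1, 1.6), ('boy', 2.0, 2.4)]
print(aoi_features(words))
```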
Continual Learning in Practice
Title | Continual Learning in Practice |
Authors | Tom Diethe, Tom Borchert, Eno Thereska, Borja Balle, Neil Lawrence |
Abstract | This paper describes a reference architecture for self-maintaining systems that can learn continually as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML, or Automatically Adaptive Machine Learning. We describe the challenges and propose a reference architecture. |
Tasks | AutoML, Continual Learning |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05202v2 |
PDF | http://arxiv.org/pdf/1903.05202v2.pdf |
PWC | https://paperswithcode.com/paper/continual-learning-in-practice |
Repo | |
Framework | |
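A minimal sketch of the self-maintaining loop such an architecture implies: monitor incoming data for distribution shift and trigger retraining when it is detected. The drift statistic (a two-sample KS test) and the threshold are illustrative choices, not the paper's prescribed components.

```python
import numpy as np
from scipy import stats

class SelfMaintainingModel:
    def __init__(self, train_fn, reference, p_threshold=0.01):
        self.train_fn, self.reference, self.p = train_fn, reference, p_threshold
        self.model = train_fn(reference)

    def observe(self, batch):
        # Two-sample KS test between reference data and the incoming batch.
        _, p_value = stats.ks_2samp(self.reference, batch)
        if p_value < self.p:                     # distribution shift detected
            self.reference = batch               # adapt the reference window
            self.model = self.train_fn(batch)    # retrain on recent data
        return self.model

train = lambda data: ('mean-model', float(np.mean(data)))
system = SelfMaintainingModel(train, np.random.normal(0, 1, 500))
system.observe(np.random.normal(3, 1, 500))      # shifted data triggers a retrain
```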
Runtime Analysis of Fitness-Proportionate Selection on Linear Functions
Title | Runtime Analysis of Fitness-Proportionate Selection on Linear Functions |
Authors | Duc-Cuong Dang, Anton Eremeev, Per Kristian Lehre |
Abstract | This paper extends the runtime analysis of non-elitist evolutionary algorithms (EAs) with fitness-proportionate selection from the simple OneMax function to linear functions. Not only does our analysis cover a larger class of fitness functions, it also holds for a wider range of mutation rates. We show that with overwhelmingly high probability, no linear function can be optimised in less than exponential time, assuming bitwise mutation rate $\Theta(1/n)$ and population size $\lambda=n^k$ for any constant $k>2$. In contrast to this negative result, we also show that for any linear function with polynomially bounded weights, the EA achieves a polynomial expected runtime if the mutation rate is reduced to $\Theta(1/n^2)$ and the population size is sufficiently large. Furthermore, the EA with mutation rate $\chi/n=\Theta(1/n)$ and modest population size $\lambda=\Omega(\ln n)$ optimises the scaled fitness function $e^{(\chi+\varepsilon)f(x)}$ for any linear function $f$ and any $\varepsilon>0$ in expected time $O(n\lambda\ln\lambda+n^2)$. These upper bounds also extend to some additively decomposed fitness functions, such as the Royal Road functions. We expect that the obtained results may be useful not only for the development of the theory of evolutionary algorithms, but also for biological applications, such as directed evolution. |
Tasks | |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08686v1 |
PDF | https://arxiv.org/pdf/1908.08686v1.pdf |
PWC | https://paperswithcode.com/paper/runtime-analysis-of-fitness-proportionate |
Repo | |
Framework | |
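A minimal sketch of the algorithm class analysed: a non-elitist EA with fitness-proportionate selection and bitwise mutation, run here on a random linear function under the exponential scaling $e^{\chi f(x)}$ from the abstract. Problem size and generation count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam, chi = 40, 200, 1.0
weights = rng.uniform(1, 2, n)                        # a linear function f(x) = w . x
f = lambda pop: pop @ weights

pop = rng.integers(0, 2, (lam, n))
for gen in range(500):
    # Selection proportional to the scaled fitness e^(chi * f(x));
    # subtracting the max only stabilises the exponential numerically.
    scaled = np.exp(chi * (f(pop) - f(pop).max()))
    probs = scaled / scaled.sum()
    parents = pop[rng.choice(lam, size=lam, p=probs)]
    flips = rng.random((lam, n)) < chi / n            # bitwise mutation, rate chi/n
    pop = np.where(flips, 1 - parents, parents)       # non-elitist: replace everyone

print('best f(x):', f(pop).max(), 'optimum:', weights.sum())
```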
Towards Making a Dependency Parser See
Title | Towards Making a Dependency Parser See |
Authors | Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez |
Abstract | We explore whether it is possible to leverage eye-tracking data in an RNN dependency parser (for English) when such information is only available during training, i.e., no aggregated or token-level gaze features are used at inference time. To do so, we train a multitask learning model that parses sentences as sequence labeling and leverages gaze features as auxiliary tasks. Our method can also be trained on disjoint datasets, i.e., it can be used to test whether already collected gaze features are useful for improving performance on new treebanks without gaze annotations. Accuracy gains are modest but positive, showing the feasibility of the approach. This can serve as a first step towards architectures that better leverage eye-tracking data or other complementary information available only for training sentences, possibly leading to improvements in syntactic parsing. |
Tasks | Eye Tracking |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01053v1 |
PDF | https://arxiv.org/pdf/1909.01053v1.pdf |
PWC | https://paperswithcode.com/paper/towards-making-a-dependency-parser-see |
Repo | |
Framework | |
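A minimal sketch of the multitask setup: a shared BiLSTM encoder with one head producing parsing-as-sequence labels and one auxiliary head regressing a gaze feature; at inference only the parsing head is used. Dimensions, the gaze feature, and the loss weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GazeAuxParser(nn.Module):
    def __init__(self, vocab, n_labels, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.parse_head = nn.Linear(2 * dim, n_labels)  # main task: parsing labels
        self.gaze_head = nn.Linear(2 * dim, 1)          # auxiliary: e.g. fixation time

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))
        return self.parse_head(h), self.gaze_head(h).squeeze(-1)

model = GazeAuxParser(vocab=1000, n_labels=50)
tokens = torch.randint(0, 1000, (4, 12))
parse_logits, gaze_pred = model(tokens)
labels = torch.randint(0, 50, (4, 12))
gaze = torch.rand(4, 12)
# Joint loss; on treebanks without gaze data, the auxiliary term is simply skipped.
loss = nn.functional.cross_entropy(parse_logits.transpose(1, 2), labels) \
     + 0.1 * nn.functional.mse_loss(gaze_pred, gaze)
loss.backward()
```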
Region Tracking in an Image Sequence: Preventing Driver Inattention
Title | Region Tracking in an Image Sequence: Preventing Driver Inattention |
Authors | Matthew Kowal, Gillian Sandison, Len Yabuki-Soh, Raner la Bastide |
Abstract | Driver inattention is a large problem on roads around the world. The objective of this project was to develop an eye tracking algorithm with sufficient computational efficiency and accuracy to reliably detect when the driver is looking away from the road for an extended period. The tracking method involves minimising a functional using gradient descent and level-set methods. The algorithm was then discretised and implemented in C and MATLAB. Multiple synthetic, grey-scale, and colour images were tested using the final design, with a desired region coverage of 82%. Further work is needed to decrease the computation time, increase the robustness of the algorithm, develop a small device capable of running the algorithm, and physically integrate this device into various vehicles. |
Tasks | Eye Tracking |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08914v1 |
PDF | https://arxiv.org/pdf/1908.08914v1.pdf |
PWC | https://paperswithcode.com/paper/region-tracking-in-an-image-sequence |
Repo | |
Framework | |
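A minimal sketch of region tracking by gradient descent on a functional with a level set: a Chan-Vese-style intensity model evolved on a synthetic image. The paper's exact functional is not given in the abstract, so this is a generic illustration (no curvature term).

```python
import numpy as np

def evolve(phi, img, steps=100, dt=0.5):
    """Evolve level set phi so {phi > 0} covers the bright tracked region."""
    for _ in range(steps):
        inside, outside = phi > 0, phi <= 0
        c1 = img[inside].mean() if inside.any() else 0.0    # mean intensity inside
        c2 = img[outside].mean() if outside.any() else 0.0  # mean intensity outside
        force = -(img - c1) ** 2 + (img - c2) ** 2          # descent direction
        phi = phi + dt * force
    return phi

img = np.zeros((64, 64)); img[20:40, 25:45] = 1.0           # synthetic bright region
yy, xx = np.mgrid[:64, :64]
phi0 = 10.0 - np.hypot(yy - 32, xx - 32)                    # initial circular contour
phi = evolve(phi0, img)
coverage = ((phi > 0) & (img > 0.5)).sum() / (img > 0.5).sum()
print(f"region coverage: {coverage:.0%}")
```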
cFineGAN: Unsupervised multi-conditional fine-grained image generation
Title | cFineGAN: Unsupervised multi-conditional fine-grained image generation |
Authors | Gunjan Aggarwal, Abhishek Sinha |
Abstract | We propose an unsupervised multi-conditional image generation pipeline, cFineGAN, that can generate an image conditioned on two input images such that the generated image preserves the texture of one and the shape of the other. To achieve this goal, we extend the recently proposed FineGAN \citep{singh2018finegan} and make use of standard as well as shape-biased pre-trained ImageNet models. We demonstrate both qualitatively and quantitatively the benefit of using the shape-biased network. We present our image generation results on three benchmark datasets: CUB-200-2011, Stanford Dogs, and UT Zappos50k. |
Tasks | Conditional Image Generation, Image Generation |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.05028v1 |
PDF | https://arxiv.org/pdf/1912.05028v1.pdf |
PWC | https://paperswithcode.com/paper/cfinegan-unsupervised-multi-conditional-fine |
Repo | |
Framework | |
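To illustrate the shape-vs-texture intuition the abstract relies on, here is a sketch comparing an image pair under a standard ImageNet encoder (texture-biased) and a shape-biased one. Torchvision's ResNet-50 stands in for both encoders; actual shape-biased weights (e.g. Stylized-ImageNet training) would have to be loaded separately.

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

def embed(model, x):
    # Global-average-pooled features from the penultimate layer.
    body = torch.nn.Sequential(*list(model.children())[:-1])
    return body(x).flatten(1)

texture_net = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).eval()
shape_net = resnet50(weights=None).eval()        # placeholder: load shape-biased weights here

x1, x2 = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
with torch.no_grad():
    tex_sim = torch.cosine_similarity(embed(texture_net, x1), embed(texture_net, x2))
    shape_sim = torch.cosine_similarity(embed(shape_net, x1), embed(shape_net, x2))
print(float(tex_sim), float(shape_sim))          # which notion of similarity dominates?
```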
Deep Lifetime Clustering
Title | Deep Lifetime Clustering |
Authors | S Chandra Mouli, Leonardo Teixeira, Jennifer Neville, Bruno Ribeiro |
Abstract | The goal of lifetime clustering is to develop an inductive model that maps subjects into $K$ clusters according to their underlying (unobserved) lifetime distribution. We introduce a neural-network based lifetime clustering model that can find cluster assignments by directly maximizing the divergence between the empirical lifetime distributions of the clusters. Accordingly, we define a novel clustering loss function over the lifetime distributions (of entire clusters) based on a tight upper bound of the two-sample Kuiper test p-value. The resultant model is robust to the modeling issues associated with the unobservability of termination signals, and does not assume proportional hazards. Our results in real and synthetic datasets show significantly better lifetime clusters (as evaluated by C-index, Brier Score, Logrank score and adjusted Rand index) as compared to competing approaches. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00547v2 |
PDF | https://arxiv.org/pdf/1910.00547v2.pdf |
PWC | https://paperswithcode.com/paper/deep-lifetime-clustering-1 |
Repo | |
Framework | |
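A minimal sketch of the two-sample Kuiper statistic on which the clustering loss's p-value bound is based: unlike Kolmogorov-Smirnov, it sums the largest deviations in both directions between the two empirical CDFs. The exponential lifetimes below are synthetic stand-ins for two clusters.

```python
import numpy as np

def kuiper_statistic(a, b):
    """V = max(F_a - F_b) + max(F_b - F_a) over all sample points."""
    grid = np.sort(np.concatenate([a, b]))
    F_a = np.searchsorted(np.sort(a), grid, side='right') / len(a)
    F_b = np.searchsorted(np.sort(b), grid, side='right') / len(b)
    return np.max(F_a - F_b) + np.max(F_b - F_a)

rng = np.random.default_rng(0)
lifetimes_c1 = rng.exponential(1.0, 400)   # empirical lifetimes, cluster 1
lifetimes_c2 = rng.exponential(2.0, 400)   # cluster 2: longer-lived subjects
print(f"Kuiper V = {kuiper_statistic(lifetimes_c1, lifetimes_c2):.3f}")
```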
Structure Learning with Similarity Preserving
Title | Structure Learning with Similarity Preserving |
Authors | Zhao Kang, Xiao Lu, Yiwei Lu, Chong Peng, Zenglin Xu |
Abstract | Leveraging the underlying low-dimensional structure of data, low-rank and sparse modeling approaches have achieved great success in a wide range of applications. However, in many applications the data can display structures beyond simply being low-rank or sparse. Fully extracting and exploiting hidden structure information in the data is therefore desirable. To reveal more of the underlying effective manifold structure, in this paper we explicitly model the data relations. Specifically, we propose a structure learning framework that retains the pairwise similarities between the data points. Rather than just trying to reconstruct the original data based on self-expression, we also reconstruct the kernel matrix, which serves to preserve similarity. Consequently, this technique is particularly suitable for the class of learning problems that are sensitive to sample similarity, e.g., clustering and semi-supervised classification. To take advantage of the representation power of deep neural networks, a deep auto-encoder architecture is further designed to implement our model. Extensive experiments on benchmark data sets demonstrate that our proposed framework can consistently and significantly improve performance on both evaluation tasks. We conclude that the quality of structure learning can be enhanced if similarity information is incorporated. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01197v1 |
PDF | https://arxiv.org/pdf/1912.01197v1.pdf |
PWC | https://paperswithcode.com/paper/structure-learning-with-similarity-preserving |
Repo | |
Framework | |
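A minimal sketch of the core idea: learn a coefficient matrix C that reconstructs both the data (self-expression, X ≈ XC) and the kernel matrix (similarity preservation, K ≈ KC), here by plain gradient descent. Regularisers and the paper's exact objective are simplified away.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 100))                        # d=5 features, n=100 points
K = np.exp(-((X.T[:, None] - X.T[None]) ** 2).sum(-1))   # Gaussian kernel matrix

C = np.zeros((100, 100))
lam, lr = 1.0, 1e-3
for _ in range(300):
    # Gradient of 0.5||X - XC||^2 + 0.5*lam*||K - KC||^2 + 0.05||C||^2
    grad = -X.T @ (X - X @ C) - lam * K @ (K - K @ C) + 0.1 * C
    C -= lr * grad

print('data residual:  ', np.linalg.norm(X - X @ C))
print('kernel residual:', np.linalg.norm(K - K @ C))
```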
Recovery of Superquadrics from Range Images using Deep Learning: A Preliminary Study
Title | Recovery of Superquadrics from Range Images using Deep Learning: A Preliminary Study |
Authors | Tim Oblak, Klemen Grm, Aleš Jaklič, Peter Peer, Vitomir Štruc, Franc Solina |
Abstract | It has been a longstanding goal in computer vision to describe the 3D physical space in terms of parameterized volumetric models that would allow autonomous machines to understand and interact with their surroundings. Such models are typically motivated by human visual perception and aim to represent all elements of the physical world, ranging from individual objects to complex scenes, using a small set of parameters. One of the de facto standards for approaching this problem is superquadrics - volumetric models that define various 3D shape primitives and can be fitted to actual 3D data (either in the form of point clouds or range images). However, existing solutions to superquadric recovery involve costly iterative fitting procedures, which limit the applicability of such techniques in practice. To alleviate this problem, we explore in this paper the possibility of recovering superquadrics from range images without time-consuming iterative parameter estimation, by using contemporary deep-learning models, more specifically, convolutional neural networks (CNNs). We pose the superquadric recovery problem as a regression task and develop a CNN regressor that is able to estimate the parameters of a superquadric model from a given range image. We train the regressor on a large set of synthetic range images, each containing a single (unrotated) superquadric shape, and evaluate the learned model in comparative experiments with the current state-of-the-art. Additionally, we present a qualitative analysis involving a dataset of real-world objects. The results of our experiments show that the proposed regressor not only outperforms the existing state-of-the-art, but also ensures a 270x faster execution time. |
Tasks | |
Published | 2019-04-13 |
URL | https://arxiv.org/abs/1904.06585v2 |
PDF | https://arxiv.org/pdf/1904.06585v2.pdf |
PWC | https://paperswithcode.com/paper/recovery-of-superquadrics-from-range-images |
Repo | |
Framework | |
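A minimal sketch of posing superquadric recovery as regression: a small CNN maps a single-channel range image to a superquadric parameter vector (here assumed to be 8 values: sizes a1..a3, shape exponents eps1, eps2, and position x, y, z), trained with MSE. Architecture and parameterisation are illustrative, not the paper's network.

```python
import torch
import torch.nn as nn

class SQRegressor(nn.Module):
    def __init__(self, n_params=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_params),                   # regress the parameter vector
        )

    def forward(self, range_img):
        return self.net(range_img)

model = SQRegressor()
range_imgs = torch.rand(4, 1, 128, 128)                # synthetic range images
target_params = torch.rand(4, 8)                       # ground-truth superquadric params
loss = nn.functional.mse_loss(model(range_imgs), target_params)
loss.backward()
```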
Stochastic Conditional Generative Networks with Basis Decomposition
Title | Stochastic Conditional Generative Networks with Basis Decomposition |
Authors | Ze Wang, Xiuyuan Cheng, Guillermo Sapiro, Qiang Qiu |
Abstract | While generative adversarial networks (GANs) have revolutionized machine learning, a number of open questions remain to fully understand them and exploit their power. One of these questions is how to efficiently achieve proper diversity and sampling of the multi-mode data space. To address this, we introduce BasisGAN, a stochastic conditional multi-mode image generator. By exploiting the observation that a convolutional filter can be well approximated as a linear combination of a small set of basis elements, we learn a plug-and-play basis generator that stochastically generates basis elements, with just a few hundred parameters, to fully embed stochasticity into convolutional filters. By sampling basis elements instead of filters, we dramatically reduce the cost of modeling the parameter space with no sacrifice of either image diversity or fidelity. To illustrate this proposed plug-and-play framework, we construct variants of BasisGAN based on state-of-the-art conditional image generation networks, and train the networks by simply plugging in a basis generator, without additional auxiliary components, hyperparameters, or training objectives. The experimental success is complemented by theoretical results indicating how the perturbations introduced by the proposed sampling of basis elements can propagate to the appearance of generated images. |
Tasks | Conditional Image Generation, Image Generation |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11286v2 |
PDF | https://arxiv.org/pdf/1909.11286v2.pdf |
PWC | https://paperswithcode.com/paper/stochastic-conditional-generative-networks |
Repo | |
Framework | |
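A minimal sketch of the core trick: a convolution layer whose filters are linear combinations of a small shared basis, with stochastic combination coefficients produced per forward pass, so the stochasticity lives in just a few parameters. This is a simplified illustration, not BasisGAN's full generator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticBasisConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, n_basis=8, z_dim=16):
        super().__init__()
        self.basis = nn.Parameter(torch.randn(n_basis, in_ch, k, k) * 0.1)
        self.coeff_gen = nn.Linear(z_dim, out_ch * n_basis)  # tiny basis-coefficient generator
        self.out_ch, self.n_basis = out_ch, n_basis

    def forward(self, x):
        z = torch.randn(x.size(0), self.coeff_gen.in_features, device=x.device)
        # One filter bank per sample: sampled coefficients mix the shared basis.
        coeffs = self.coeff_gen(z).view(-1, self.out_ch, self.n_basis)
        outs = []
        for c, xi in zip(coeffs, x):
            w = torch.einsum('ob,bikl->oikl', c, self.basis)  # build the filters
            outs.append(F.conv2d(xi.unsqueeze(0), w, padding=1))
        return torch.cat(outs)

layer = StochasticBasisConv(3, 16)
imgs = torch.randn(2, 3, 32, 32)
print(layer(imgs).shape)                                     # torch.Size([2, 16, 32, 32])
```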