July 28, 2019

Paper Group ANR 314

Dynamic Graph Convolutional Networks

Title Dynamic Graph Convolutional Networks
Authors Franco Manessi, Alessandro Rozza, Mario Manzo
Abstract Many different classification tasks need to manage structured data, which are usually modeled as graphs. Moreover, these graphs can be dynamic, meaning that the vertices/edges of each graph may change over time. Our goal is to jointly exploit structured data and temporal information through the use of a neural network model. To the best of our knowledge, this task has not been addressed using this kind of architecture. For this reason, we propose two novel approaches, which combine Long Short-Term Memory networks and Graph Convolutional Networks to learn long short-term dependencies together with graph structure. The quality of our methods is confirmed by the promising results achieved.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06199v1
PDF http://arxiv.org/pdf/1704.06199v1.pdf
PWC https://paperswithcode.com/paper/dynamic-graph-convolutional-networks
Repo
Framework
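
The graph-convolution half of the approach described in the abstract above can be illustrated with a minimal sketch. It assumes the widely used propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W) applied independently to each snapshot of a dynamic graph; the abstract does not spell out the exact layer, and the LSTM that would consume the per-step outputs is omitted. The function names, toy snapshots, and weight values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A."""
    n = len(adj)
    # Add self-loops so each vertex keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ features @ weights)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]

# Toy dynamic graph (assumption): two snapshots of a 3-vertex graph whose edges change.
snapshots = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one feature row per vertex
weights = [[1.0], [1.0]]                          # hypothetical 2 -> 1 projection

# One embedding per time step; a recurrent model would read this sequence.
per_step = [gcn_layer(adj, features, weights) for adj in snapshots]
```

Applying the same weights at every time step is a design choice of this sketch; it keeps the per-snapshot embeddings comparable before any temporal model sees them.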

A Color Quantization Optimization Approach for Image Representation Learning

Title A Color Quantization Optimization Approach for Image Representation Learning
Authors É. M. D. A. Pereira, J. A. dos Santos
Abstract Over the last two decades, hand-crafted feature extractors have been used to compose image representations. Recently, data-driven feature learning has been explored as a way of producing more representative visual features. In this work, we propose two approaches to learning visual image representations, which aim to provide more effective and compact representations. Our strategy employs Genetic Algorithms to improve hand-crafted feature extraction algorithms by optimizing colour quantization for the image domain. Our hypothesis is that changes in the quantization affect the description quality of the features, enabling representation improvements. We conducted a series of experiments to evaluate the robustness of the proposed approaches in the task of content-based image retrieval on eight well-known datasets with different visual properties. Experimental results indicated that the approach focused on representation effectiveness outperformed the baselines in all the tested scenarios. The other approach, more focused on compactness, produced competitive results while keeping the final feature dimensionality unchanged or reducing it by up to 25%, with statistically equivalent performance.
Tasks Content-Based Image Retrieval, Image Retrieval, Quantization, Representation Learning
Published 2017-11-18
URL http://arxiv.org/abs/1711.06809v2
PDF http://arxiv.org/pdf/1711.06809v2.pdf
PWC https://paperswithcode.com/paper/a-color-quantization-optimization-approach
Repo
Framework
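
To make concrete why the choice of colour quantization changes both the content and the size of a hand-crafted descriptor, here is a minimal sketch of a colour histogram under uniform per-channel quantization. It is an illustration only: the abstract does not describe how the Genetic Algorithm encodes a quantization, so the `levels` tuple and both function names are hypothetical.

```python
def quantize_pixel(pixel, levels):
    """Map an 8-bit RGB pixel to a single bin index.

    `levels` gives the number of quantization levels per channel (R, G, B).
    """
    r, g, b = (channel * n // 256 for channel, n in zip(pixel, levels))
    return (r * levels[1] + g) * levels[2] + b

def color_histogram(pixels, levels):
    """Count pixels per quantized colour; the length is the feature dimensionality."""
    hist = [0] * (levels[0] * levels[1] * levels[2])
    for pixel in pixels:
        hist[quantize_pixel(pixel, levels)] += 1
    return hist

# Toy image (assumption): four pixels, two of them nearly the same red.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]

coarse = color_histogram(pixels, (2, 2, 2))  # 8-dimensional descriptor
fine = color_histogram(pixels, (4, 4, 4))    # 64-dimensional descriptor
```

A search procedure such as a Genetic Algorithm could treat `levels` (or a non-uniform variant of it) as the candidate solution and score it by retrieval quality, which is the trade-off between effectiveness and compactness that the abstract refers to.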

Real-time visual tracking by deep reinforced decision making

Title Real-time visual tracking by deep reinforced decision making
Authors Janghoon Choi, Junseok Kwon, Kyoung Mu Lee
Abstract One of the major challenges of the model-free visual tracking problem has been the difficulty originating from the unpredictable and drastic changes in the appearance of the objects we aim to track. Existing methods tackle this problem by updating the appearance model on-line in order to adapt to the changes in appearance. Despite the success of these methods, however, inaccurate and erroneous updates of the appearance model result in tracker drift. In this paper, we introduce a novel real-time visual tracking algorithm based on a template selection strategy constructed by deep reinforcement learning methods. The tracking algorithm utilizes this strategy to choose the appropriate template for tracking a given frame. The template selection strategy is self-learned by utilizing a simple policy gradient method on numerous training episodes randomly generated from a tracking benchmark dataset. Our proposed reinforcement learning framework is generally applicable to other confidence map based tracking algorithms. The experiment shows that our tracking algorithm runs at a real-time speed of 43 fps and the proposed policy network effectively decides the appropriate template for successful visual tracking.
Tasks Decision Making, Real-Time Visual Tracking, Visual Tracking
Published 2017-02-21
URL http://arxiv.org/abs/1702.06291v2
PDF http://arxiv.org/pdf/1702.06291v2.pdf
PWC https://paperswithcode.com/paper/real-time-visual-tracking-by-deep-reinforced
Repo
Framework
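
The idea of learning a template selection strategy with a policy gradient can be sketched on a toy problem. The sketch below assumes a stateless softmax policy over three templates with fixed, hypothetical rewards, and it uses the exact expected gradient instead of sampled training episodes, so it is far simpler than the policy network described in the abstract.

```python
import math

def softmax(theta):
    """Turn preference scores into selection probabilities."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_step(theta, rewards, lr):
    """One gradient-ascent step on the expected reward J = sum_a pi_a * r_a.

    For a softmax policy, dJ/dtheta_a = pi_a * (r_a - J).
    """
    probs = softmax(theta)
    expected = sum(p * r for p, r in zip(probs, rewards))
    return [t + lr * p * (r - expected) for t, p, r in zip(theta, probs, rewards)]

# Hypothetical average tracking reward obtained when using each template.
template_rewards = [0.2, 0.9, 0.5]

theta = [0.0, 0.0, 0.0]  # start with a uniform policy
for _ in range(500):
    theta = policy_gradient_step(theta, template_rewards, lr=1.0)

probs = softmax(theta)
best_template = max(range(len(probs)), key=probs.__getitem__)
```

In a tracker the reward would depend on the current frame, so the preferences would come from a network evaluated on that frame instead of a fixed `theta` vector.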

Clustering Stable Instances of Euclidean k-means

Title Clustering Stable Instances of Euclidean k-means
Authors Abhratanu Dutta, Aravindan Vijayaraghavan, Alex Wang
Abstract The Euclidean k-means problem is arguably the most widely-studied clustering problem in machine learning. While the k-means objective is NP-hard in the worst-case, practitioners have enjoyed remarkable success in applying heuristics like Lloyd’s algorithm for this problem. To address this disconnect, we study the following question: what properties of real-world instances will enable us to design efficient algorithms and prove guarantees for finding the optimal clustering? We consider a natural notion called additive perturbation stability that we believe captures many practical instances. Stable instances have unique optimal k-means solutions that do not change even when each point is perturbed a little (in Euclidean distance). This captures the property that the k-means optimal solution should be tolerant to measurement errors and uncertainty in the points. We design efficient algorithms that provably recover the optimal clustering for instances that are additive perturbation stable. When the instance has some additional separation, we show an efficient algorithm with provable guarantees that is also robust to outliers. We complement these results by studying the amount of stability in real datasets and demonstrating that our algorithm performs well on these benchmark datasets.
Tasks
Published 2017-12-04
URL http://arxiv.org/abs/1712.01241v1
PDF http://arxiv.org/pdf/1712.01241v1.pdf
PWC https://paperswithcode.com/paper/clustering-stable-instances-of-euclidean-k-1
Repo
Framework
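
For readers unfamiliar with the heuristic named in the abstract above, here is a minimal sketch of Lloyd's algorithm, followed by a small check in the spirit of additive perturbation stability: each point is moved slightly and the resulting clustering is compared with the original. This is not the paper's algorithm for stable instances; the toy points, offsets, and initial centers are assumptions chosen for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd(points, centers, iterations=100):
    """Alternate assignment and mean-update steps until the centers stop moving."""
    centers = [tuple(float(c) for c in center) for center in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
                  for p in points]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
init = [(0.0, 0.0), (10.0, 10.0)]
centers, labels = lloyd(points, init)

# Move every point a little and check that the clustering is unchanged.
offsets = [(0.1, -0.1), (-0.1, 0.1), (0.1, 0.1), (-0.1, -0.1)]
perturbed = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, offsets)]
_, perturbed_labels = lloyd(perturbed, init)
stable_under_offsets = perturbed_labels == labels
```

Lloyd's algorithm only guarantees a local optimum in general, which is the gap between practice and worst-case theory that the abstract sets out to explain.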

A Survey on Deep Learning in Medical Image Analysis

Title A Survey on Deep Learning in Medical Image Analysis
Authors Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A. W. M. van der Laak, Bram van Ginneken, Clara I. Sánchez
Abstract Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.
Tasks Image Classification, Object Detection
Published 2017-02-19
URL http://arxiv.org/abs/1702.05747v2
PDF http://arxiv.org/pdf/1702.05747v2.pdf
PWC https://paperswithcode.com/paper/a-survey-on-deep-learning-in-medical-image
Repo
Framework

Combining Real-Valued and Binary Gabor-Radon Features for Classification and Search in Medical Imaging Archives

Title Combining Real-Valued and Binary Gabor-Radon Features for Classification and Search in Medical Imaging Archives
Authors Hamed Erfankhah, Mehran Yazdi, H. R. Tizhoosh
Abstract Content-based image retrieval (CBIR) of medical images in large datasets, which identifies similar images when a query image is given, can be very useful in improving the diagnostic decisions of clinical experts as well as in educational scenarios. In this paper, we use a two-stage classification and retrieval approach to retrieve similar images. First, Gabor filters are applied to Radon-transformed images to extract features and to train a multi-class SVM. Then, based on the classification results and using an extracted Gabor barcode, similar images are retrieved. The proposed method was tested on the IRMA dataset, which contains more than 14,000 images. Experimental results show the efficiency of our approach in retrieving similar images compared to other Gabor-Radon-oriented methods.
Tasks Content-Based Image Retrieval, Image Retrieval
Published 2017-09-27
URL http://arxiv.org/abs/1709.09754v1
PDF http://arxiv.org/pdf/1709.09754v1.pdf
PWC https://paperswithcode.com/paper/combining-real-valued-and-binary-gabor-radon
Repo
Framework

Boosting Variational Inference: an Optimization Perspective

Title Boosting Variational Inference: an Optimization Perspective
Authors Francesco Locatello, Rajiv Khanna, Joydeep Ghosh, Gunnar Rätsch
Abstract Variational inference is a popular technique to approximate a possibly intractable Bayesian posterior with a more tractable one. Recently, boosting variational inference has been proposed as a new paradigm to approximate the posterior by a mixture of densities by greedily adding components to the mixture. However, as is the case with many other variational inference algorithms, its theoretical properties have not been studied. In the present work, we study the convergence properties of this approach from a modern optimization viewpoint by establishing connections to the classic Frank-Wolfe algorithm. Our analysis yields novel theoretical insights regarding the sufficient conditions for convergence, explicit rates, and algorithmic simplifications. Since a lot of focus in previous works on variational inference has been on tractability, our work is especially important as a much-needed attempt to bridge the gap between probabilistic models and their corresponding theoretical properties.
Tasks
Published 2017-08-05
URL http://arxiv.org/abs/1708.01733v2
PDF http://arxiv.org/pdf/1708.01733v2.pdf
PWC https://paperswithcode.com/paper/boosting-variational-inference-an
Repo
Framework
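
The connection to the Frank-Wolfe algorithm mentioned in the abstract can be illustrated on a toy problem. The sketch below minimizes a simple quadratic over the probability simplex; each iteration picks one vertex (loosely analogous to adding one mixture component) and blends it into the current iterate with step size 2/(k+2). The objective, the target vector, and the function names are assumptions for illustration and are not the variational objective studied in the paper.

```python
def frank_wolfe_simplex(grad, dim, steps=200):
    """Minimize a smooth convex function over the probability simplex."""
    w = [1.0] + [0.0] * (dim - 1)  # start at a vertex
    for k in range(steps):
        g = grad(w)
        # Linear minimization oracle: the best simplex vertex is the
        # coordinate with the smallest gradient entry.
        j = min(range(dim), key=g.__getitem__)
        gamma = 2.0 / (k + 2.0)
        # Convex combination keeps the iterate inside the simplex.
        w = [(1.0 - gamma) * wi for wi in w]
        w[j] += gamma
    return w

target = [0.2, 0.5, 0.3]  # hypothetical mixture weights to recover

def grad(w):
    """Gradient of f(w) = sum_i (w_i - target_i)^2."""
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]

weights = frank_wolfe_simplex(grad, dim=3)
gap = sum((wi - ti) ** 2 for wi, ti in zip(weights, target))
```

No projection step is needed because every update is a convex combination of feasible points, which is one reason the greedy mixture construction maps naturally onto this algorithm.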

Source-Target Inference Models for Spatial Instruction Understanding

Title Source-Target Inference Models for Spatial Instruction Understanding
Authors Hao Tan, Mohit Bansal
Abstract Models that can execute natural language instructions for situated robotic tasks such as assembly and navigation have several useful applications in homes, offices, and remote scenarios. We study the semantics of spatially-referred configuration and arrangement instructions, based on the challenging Bisk-2016 blank-labeled block dataset. This task involves finding a source block and moving it to the target position (mentioned via a reference block and offset), where the blocks have no names or colors and are just referred to via spatial location features. We present novel models for the subtasks of source block classification and target position regression, based on joint-loss language and spatial-world representation learning, as well as CNN-based and dual attention models to compute the alignment between the world blocks and the instruction phrases. For target position prediction, we compare two inference approaches: annealed sampling via policy gradient versus expectation inference via supervised regression. Our models achieve the new state-of-the-art on this task, with an improvement of 47% on source block accuracy and 22% on target position distance.
Tasks Representation Learning
Published 2017-07-12
URL http://arxiv.org/abs/1707.03804v2
PDF http://arxiv.org/pdf/1707.03804v2.pdf
PWC https://paperswithcode.com/paper/source-target-inference-models-for-spatial
Repo
Framework

Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image

Title Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image
Authors Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Céline Teulière, Thierry Chateau
Abstract In this paper, we present a novel approach, called Deep MANTA (Deep Many-Tasks), for many-task vehicle analysis from a given image. A robust convolutional network is introduced for simultaneous vehicle detection, part localization, visibility characterization and 3D dimension estimation. Its architecture is based on a new coarse-to-fine object proposal that boosts the vehicle detection. Moreover, the Deep MANTA network is able to localize vehicle parts even if these parts are not visible. In the inference, the network’s outputs are used by a real time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization. We show in experiments that our method outperforms monocular state-of-the-art approaches on vehicle detection, orientation and 3D location tasks on the very challenging KITTI benchmark.
Tasks Pose Estimation
Published 2017-03-22
URL http://arxiv.org/abs/1703.07570v1
PDF http://arxiv.org/pdf/1703.07570v1.pdf
PWC https://paperswithcode.com/paper/deep-manta-a-coarse-to-fine-many-task-network
Repo
Framework

Geometric Convolutional Neural Network for Analyzing Surface-Based Neuroimaging Data

Title Geometric Convolutional Neural Network for Analyzing Surface-Based Neuroimaging Data
Authors Si-Baek Seong, Chongwon Pae, Hae-Jeong Park
Abstract The conventional CNN, widely used for two-dimensional images, is not directly applicable to non-regular geometric surfaces, such as cortical thickness. We propose Geometric CNN (gCNN), which deals with data representation over a spherical surface and renders pattern recognition in a multi-shell mesh structure. The classification accuracy for sex was significantly higher than that of SVM and image-based CNN. It uses only MRI thickness data to classify gender, but the method can be extended to classify disease from other MRI or fMRI data.
Tasks
Published 2017-08-02
URL http://arxiv.org/abs/1708.00587v1
PDF http://arxiv.org/pdf/1708.00587v1.pdf
PWC https://paperswithcode.com/paper/geometric-convolutional-neural-network-for
Repo
Framework

Color and Gradient Features for Text Segmentation from Video Frames

Title Color and Gradient Features for Text Segmentation from Video Frames
Authors P. Shivakumara, D. S. Guru, H. T. Basavaraju
Abstract Text segmentation in video is drawing the attention of researchers in image processing, pattern recognition and document image analysis because it helps in annotating and labeling video events accurately. We propose a novel idea of generating an enhanced frame from the R, G, and B channels of an input frame by grouping high and low values using Min-Max clustering criteria. We also apply a sliding window to the enhanced frame to group high and low values among neighboring pixels, which enhances the frame further. Subsequently, we use the k-means clustering algorithm with k=2 to separate text and non-text regions. Fully connected components are then identified in the skeleton of the frame obtained by k-means clustering. Connected component analysis based on gradient features is adapted for symmetry verification. The components that satisfy symmetry verification are selected as representatives of text regions and are allowed to grow until they fully cover their respective text regions. The method is tested on a variety of video frames and evaluated in terms of recall, precision and f-measure. The results show that the method is promising.
Tasks
Published 2017-08-22
URL http://arxiv.org/abs/1708.06561v1
PDF http://arxiv.org/pdf/1708.06561v1.pdf
PWC https://paperswithcode.com/paper/color-and-gradient-features-for-text
Repo
Framework

The Second Order Linear Model

Title The Second Order Linear Model
Authors Ming Lin, Shuang Qiu, Bin Hong, Jieping Ye
Abstract We study a fundamental class of regression models called the second order linear model (SLM). The SLM extends the linear model to high order functional space and has attracted considerable research interest recently. Yet how to efficiently learn the SLM under full generality using a nonconvex solver remains an open question due to several fundamental limitations of the conventional gradient descent learning framework. In this study, we attack this problem with a gradient-free approach which we call the moment-estimation-sequence (MES) method. We show that the conventional gradient descent heuristic is biased by the skewness of the distribution and is therefore no longer the best practice for learning the SLM. Based on the MES framework, we design a nonconvex alternating iteration process to train a $d$-dimensional rank-$k$ SLM within $O(kd)$ memory and one pass over the dataset. The proposed method converges globally and linearly, and achieves $\epsilon$ recovery error after retrieving $O[k^{2}d\cdot\mathrm{polylog}(kd/\epsilon)]$ samples. Furthermore, our theoretical analysis reveals that not all SLMs can be learned on every sub-gaussian distribution. When the instances are sampled from a so-called $\tau$-MIP distribution, the SLM can be learned by $O(p/\tau^{2})$ samples where $p$ and $\tau$ are positive constants depending on the skewness and kurtosis of the distribution. For non-MIP distributions, an additional diagonal-free oracle is necessary and sufficient to guarantee the learnability of the SLM. Numerical simulations verify the sharpness of our bounds on the sampling complexity and the linear convergence rate of our algorithm.
Tasks
Published 2017-03-02
URL http://arxiv.org/abs/1703.00598v3
PDF http://arxiv.org/pdf/1703.00598v3.pdf
PWC https://paperswithcode.com/paper/the-second-order-linear-model
Repo
Framework
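
The $O(kd)$ memory figure in the abstract above is easier to see with a small sketch. The abstract does not state the model equation, so the code assumes the form $y = x^\top w + x^\top M x$ with a rank-$k$ matrix $M = \sum_{i} u_i v_i^\top$; under that assumption the quadratic term can be evaluated from the $2k$ factor vectors without ever forming the $d \times d$ matrix. All names and values are hypothetical, and the MES training procedure itself is not shown.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def slm_predict(x, w, us, vs):
    """Evaluate x.w + sum_i (u_i.x)(v_i.x) using only O(kd) stored numbers."""
    linear = dot(x, w)
    quadratic = sum(dot(u, x) * dot(v, x) for u, v in zip(us, vs))
    return linear + quadratic

def slm_predict_dense(x, w, us, vs):
    """Reference version that builds the full d x d matrix M (O(d^2) memory)."""
    d = len(x)
    m = [[sum(u[i] * v[j] for u, v in zip(us, vs)) for j in range(d)] for i in range(d)]
    quadratic = sum(x[i] * m[i][j] * x[j] for i in range(d) for j in range(d))
    return dot(x, w) + quadratic

# Toy instance (assumption): d = 3, rank k = 1.
x = [1.0, 2.0, 3.0]
w = [0.5, 0.0, -1.0]
us = [[1.0, 0.0, 1.0]]
vs = [[0.0, 1.0, 0.0]]

low_rank_value = slm_predict(x, w, us, vs)
dense_value = slm_predict_dense(x, w, us, vs)
```

Both functions return the same prediction; only the factored version keeps storage linear in $d$ for fixed $k$.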

Ensemble of heterogeneous flexible neural trees using multiobjective genetic programming

Title Ensemble of heterogeneous flexible neural trees using multiobjective genetic programming
Authors Varun Kumar Ojha, Ajith Abraham, Václav Snášel
Abstract Machine learning algorithms are inherently multiobjective in nature, where approximation error minimization and model complexity reduction are two conflicting objectives. We proposed a multiobjective genetic programming (MOGP) approach for creating a heterogeneous flexible neural tree (HFNT), a tree-like flexible feedforward neural network model. The functional heterogeneity in neural tree nodes was introduced to capture better insight into the data during learning because each input in a dataset possesses different features. MOGP guided an initial HFNT population towards Pareto-optimal solutions, where the final population was used for making an ensemble system. A diversity index measure along with approximation error and complexity was introduced to maintain diversity among the candidates in the population. Hence, the ensemble was created by using accurate, structurally simple, and diverse candidates from the final MOGP population. A differential evolution algorithm was applied to fine-tune the underlying parameters of the selected candidates. A comprehensive test over classification, regression, and time-series datasets proved the efficiency of the proposed algorithm over other available prediction methods. Moreover, the heterogeneous creation of HFNT proved to be efficient in making an ensemble system from the final population.
Tasks Time Series
Published 2017-05-16
URL http://arxiv.org/abs/1705.05592v1
PDF http://arxiv.org/pdf/1705.05592v1.pdf
PWC https://paperswithcode.com/paper/ensemble-of-heterogeneous-flexible-neural
Repo
Framework

End-to-End Learning of Video Super-Resolution with Motion Compensation

Title End-to-End Learning of Video Super-Resolution with Motion Compensation
Authors Osama Makansi, Eddy Ilg, Thomas Brox
Abstract Learning approaches have shown great success in the task of super-resolving an image given a low resolution input. Video super-resolution aims to additionally exploit the information from multiple images. Typically, the images are related via optical flow and consecutive image warping. In this paper, we provide an end-to-end video super-resolution network that, in contrast to previous works, includes the estimation of optical flow in the overall network architecture. We analyze the usage of optical flow for video super-resolution and find that common off-the-shelf image warping does not allow video super-resolution to benefit much from optical flow. Instead, we propose an operation for motion compensation that performs warping from low to high resolution directly. We show that with this network configuration, video super-resolution can benefit from optical flow and we obtain state-of-the-art results on the popular test sets. We also show that the processing of whole images rather than independent patches is responsible for a large increase in accuracy.
Tasks Motion Compensation, Optical Flow Estimation, Super-Resolution, Video Super-Resolution
Published 2017-07-03
URL http://arxiv.org/abs/1707.00471v1
PDF http://arxiv.org/pdf/1707.00471v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-video-super-resolution
Repo
Framework

A New Adaptive Video Super-Resolution Algorithm With Improved Robustness to Innovations

Title A New Adaptive Video Super-Resolution Algorithm With Improved Robustness to Innovations
Authors Ricardo Augusto Borsoi, Guilherme Holsbach Costa, José Carlos Moreira Bermudez
Abstract In this paper, a new video super-resolution reconstruction (SRR) method with improved robustness to outliers is proposed. Although the R-LMS is one of the SRR algorithms with the best reconstruction quality for its computational cost, and is naturally robust to registration inaccuracies, its performance is known to degrade severely in the presence of innovation outliers. By studying the proximal point cost function representation of the R-LMS iterative equation, a better understanding of its performance under different situations is attained. Using statistical properties of typical innovation outliers, a new cost function is then proposed and two new algorithms are derived, which present improved robustness to outliers while maintaining computational costs comparable to that of R-LMS. Monte Carlo simulation results illustrate that the proposed method outperforms the traditional and regularized versions of LMS, and is competitive with state-of-the-art SRR methods at a much smaller computational cost.
Tasks Super-Resolution, Video Super-Resolution
Published 2017-06-14
URL http://arxiv.org/abs/1706.04695v3
PDF http://arxiv.org/pdf/1706.04695v3.pdf
PWC https://paperswithcode.com/paper/a-new-adaptive-video-super-resolution
Repo
Framework
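
As background for the LMS family discussed in the abstract above, here is a minimal sketch of a plain LMS-style update for super-resolution on a 1-D signal. It assumes a known degradation operator that averages neighbouring sample pairs, a single static observation, and no regularization or registration, so it is neither the R-LMS nor the algorithms proposed in the paper. The operator, step size, and signal values are hypothetical.

```python
def downsample(x):
    """Degradation model D: average each pair of neighbouring samples."""
    return [(x[2 * i] + x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]

def downsample_adjoint(r):
    """Adjoint D^T: spread each low-resolution residual over its two source samples."""
    out = []
    for v in r:
        out.extend([v / 2.0, v / 2.0])
    return out

def lms_step(x, y, mu):
    """One update x <- x + mu * D^T (y - D x), a gradient step on 0.5 * ||y - D x||^2."""
    residual = [yi - di for yi, di in zip(y, downsample(x))]
    correction = downsample_adjoint(residual)
    return [xi + mu * ci for xi, ci in zip(x, correction)]

observed = [1.0, 3.0]            # low-resolution observation (assumption)
estimate = [0.0, 0.0, 0.0, 0.0]  # high-resolution estimate, initialized to zero

errors = []
for _ in range(10):
    estimate = lms_step(estimate, observed, mu=1.0)
    reprojected = downsample(estimate)
    errors.append(max(abs(yi - di) for yi, di in zip(observed, reprojected)))
```

In a video setting one such step would typically be taken per incoming frame after motion compensation, which is where innovation outliers between frames can disturb the estimate.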