May 6, 2019

Paper Group ANR 348

Subspace Perspective on Canonical Correlation Analysis: Dimension Reduction and Minimax Rates

Title Subspace Perspective on Canonical Correlation Analysis: Dimension Reduction and Minimax Rates
Authors Zhuang Ma, Xiaodong Li
Abstract Canonical correlation analysis (CCA) is a fundamental statistical tool for exploring the correlation structure between two sets of random variables. In this paper, motivated by the recent success of applying CCA to learn low dimensional representations of high dimensional objects, we propose to quantify the estimation loss of CCA by the excess prediction loss defined through a prediction-after-dimension-reduction framework. Such a framework suggests viewing CCA estimation as estimating the subspaces spanned by the canonical variates. Interestingly, the proposed error metrics derived from the excess prediction loss turn out to be closely related to the principal angles between the subspaces spanned by the population and sample canonical variates respectively. We characterize the non-asymptotic minimax rates under the proposed metrics, especially the dependency of the minimax rates on the key quantities including the dimensions, the condition number of the covariance matrices, the canonical correlations and the eigen-gap, with minimal assumptions on the joint covariance matrix. To the best of our knowledge, this is the first finite sample result that captures the effect of the canonical correlations on the minimax rates.
Tasks Dimensionality Reduction
Published 2016-05-12
URL http://arxiv.org/abs/1605.03662v2
PDF http://arxiv.org/pdf/1605.03662v2.pdf
PWC https://paperswithcode.com/paper/subspace-perspective-on-canonical-correlation
Repo
Framework
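
The abstract above relates its error metrics to the principal angles between two subspaces. As a minimal sketch of that quantity only (not the paper's loss or estimator), the angles can be computed from orthonormal bases via an SVD. The function name `principal_angles` and the toy matrices `U` and `V` are illustrative assumptions.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B."""
    # Orthonormal bases for each column space.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Toy example: two planes in R^3 that share the x-axis.
U = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # span{e1, e2}
V = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])  # span{e1, e3}
angles = principal_angles(U, V)
```

For these two planes the shared direction gives an angle of zero and the remaining directions are orthogonal, so the second angle is a right angle.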

Discovering Picturesque Highlights from Egocentric Vacation Videos

Title Discovering Picturesque Highlights from Egocentric Vacation Videos
Authors Vinay Bettadapura, Daniel Castro, Irfan Essa
Abstract We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset which includes 26.5 hours of videos taken over a 14-day vacation that spans many famous tourist destinations and also provide results from a user-study to assess our results.
Tasks
Published 2016-01-18
URL http://arxiv.org/abs/1601.04406v1
PDF http://arxiv.org/pdf/1601.04406v1.pdf
PWC https://paperswithcode.com/paper/discovering-picturesque-highlights-from
Repo
Framework

Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks

Title Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks
Authors Michele Volpi, Devis Tuia
Abstract Semantic labeling (or pixel-level land-cover classification) in ultra-high resolution imagery (< 10cm) requires statistical models able to learn high level concepts from spatial data, with large appearance variations. Convolutional Neural Networks (CNNs) achieve this goal by learning discriminatively a hierarchy of representations of increasing abstraction. In this paper we present a CNN-based system relying on a downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolutions. By doing so, the CNN learns to densely label every pixel at the original resolution of the image. This results in many advantages, including i) state-of-the-art numerical accuracy, ii) improved geometric accuracy of predictions and iii) high efficiency at inference time. We test the proposed system on the Vaihingen and Potsdam sub-decimeter resolution datasets, involving semantic labeling of aerial images of 9cm and 5cm resolution, respectively. These datasets are composed of many large and fully annotated tiles allowing an unbiased evaluation of models making use of spatial information. We do so by comparing two standard CNN architectures to the proposed one: standard patch classification, prediction of local label patches by employing only convolutions and full patch labeling by employing deconvolutions. All the systems compare favorably with or outperform a state-of-the-art baseline relying on superpixels and powerful appearance descriptors. The proposed full patch labeling CNN outperforms these models by a large margin, also showing a very appealing inference time.
Tasks
Published 2016-08-02
URL http://arxiv.org/abs/1608.00775v2
PDF http://arxiv.org/pdf/1608.00775v2.pdf
PWC https://paperswithcode.com/paper/dense-semantic-labeling-of-sub-decimeter
Repo
Framework

Multivariate mixture model for myocardium segmentation combining multi-source images

Title Multivariate mixture model for myocardium segmentation combining multi-source images
Authors Xiahai Zhuang
Abstract This paper proposes a method for simultaneous segmentation of multi-source images, using the multivariate mixture model (MvMM) and maximum of log-likelihood (LL) framework. The segmentation is a procedure of texture classification, and the MvMM is used to model the joint intensity distribution of the images. Specifically, the method is applied to the myocardial segmentation combining the complementary texture information from multi-sequence (MS) cardiac magnetic resonance (CMR) images. Furthermore, there exist inter-image mis-registration and intra-image misalignment of slices in the MS CMR images. Hence, the MvMM is formulated with transformations, which are embedded into the LL framework and optimized simultaneously with the segmentation parameters. The proposed method is able to correct the inter- and intra-image misalignment by registering each slice of the MS CMR to a virtual common space, as well as to delineate the indistinguishable boundaries of myocardium consisting of pathologies. Results have shown statistically significant improvement in the segmentation performance of the proposed method with respect to the conventional approaches which can solely segment each image separately. The proposed method has also demonstrated better robustness in the incongruent data, where some images may not fully cover the region of interest and the full coverage can only be reconstructed combining the images from multiple sources.
Tasks Texture Classification
Published 2016-12-28
URL http://arxiv.org/abs/1612.08820v1
PDF http://arxiv.org/pdf/1612.08820v1.pdf
PWC https://paperswithcode.com/paper/multivariate-mixture-model-for-myocardium
Repo
Framework

Unsupervised Human Action Detection by Action Matching

Title Unsupervised Human Action Detection by Action Matching
Authors Basura Fernando, Sareh Shirazi, Stephen Gould
Abstract We propose a new task of unsupervised action detection by action matching. Given two long videos, the objective is to temporally detect all pairs of matching video segments. A pair of video segments are matched if they share the same human action. The task is category independent—it does not matter what action is being performed—and no supervision is used to discover such video segments. Unsupervised action detection by action matching allows us to align videos in a meaningful manner. As such, it can be used to discover new action categories or as an action proposal technique within, say, an action detection pipeline. Moreover, it is a useful pre-processing step for generating video highlights, e.g., from sports videos. We present an effective and efficient method for unsupervised action detection. We use an unsupervised temporal encoding method and exploit the temporal consistency in human actions to obtain candidate action segments. We evaluate our method on this challenging task using three activity recognition benchmarks, namely, the MPII Cooking activities dataset, the THUMOS15 action detection benchmark and a new dataset called the IKEA dataset. On the MPII Cooking dataset we detect action segments with a precision of 21.6% and recall of 11.7% over 946 long video pairs and over 5000 ground truth action segments. Similarly, on THUMOS dataset we obtain 18.4% precision and 25.1% recall over 5094 ground truth action segment pairs.
Tasks Action Detection, Activity Recognition
Published 2016-12-02
URL http://arxiv.org/abs/1612.00558v4
PDF http://arxiv.org/pdf/1612.00558v4.pdf
PWC https://paperswithcode.com/paper/unsupervised-human-action-detection-by-action
Repo
Framework

Neural Network Architecture Optimization through Submodularity and Supermodularity

Title Neural Network Architecture Optimization through Submodularity and Supermodularity
Authors Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Changshui Zhang
Abstract Deep learning models’ architectures, including depth and width, are key factors influencing models’ performance, such as test accuracy and computation time. This paper solves two problems: given computation time budget, choose an architecture to maximize accuracy, and given accuracy requirement, choose an architecture to minimize computation time. We convert this architecture optimization into a subset selection problem. With accuracy’s submodularity and computation time’s supermodularity, we propose efficient greedy optimization algorithms. The experiments demonstrate our algorithm’s ability to find more accurate models or faster models. By analyzing architecture evolution with growing time budget, we discuss relationships among accuracy, time and architecture, and give suggestions on neural network architecture design.
Tasks
Published 2016-09-01
URL http://arxiv.org/abs/1609.00074v3
PDF http://arxiv.org/pdf/1609.00074v3.pdf
PWC https://paperswithcode.com/paper/neural-network-architecture-optimization
Repo
Framework
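
The abstract above casts architecture choice as subset selection solved greedily under a budget. The following is a generic cost-benefit greedy sketch for that kind of problem, not the paper's algorithm: the coverage function `gain` stands in for accuracy (coverage is a standard submodular function), the additive `cost` stands in for computation time, and the item names, `COVER`, `COST`, and `greedy_budgeted` are all hypothetical.

```python
# Hypothetical items: each "covers" some elements and has a fixed cost.
COVER = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5}}
COST = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 5.0}

def gain(subset):
    """Submodular stand-in for accuracy: number of elements covered."""
    return len(set().union(*(COVER[i] for i in subset)))

def cost(subset):
    """Additive stand-in for computation time."""
    return sum(COST[i] for i in subset)

def greedy_budgeted(items, gain, cost, budget):
    """Repeatedly add the feasible item with the best marginal gain per unit cost."""
    chosen = []
    remaining = set(items)
    while remaining:
        best, best_ratio = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            candidate = chosen + [item]
            if cost(candidate) > budget:
                continue  # adding this item would exceed the budget
            extra_gain = gain(candidate) - gain(chosen)
            extra_cost = cost(candidate) - cost(chosen)
            ratio = extra_gain / extra_cost
            if ratio > best_ratio:
                best, best_ratio = item, ratio
        if best is None:
            break  # nothing feasible improves the objective
        chosen.append(best)
        remaining.remove(best)
    return chosen

chosen = greedy_budgeted(COVER.keys(), gain, cost, budget=4.0)
```

With a budget of 4.0 the single expensive item "d" is never feasible, so the greedy pass assembles the same coverage from the three cheaper items.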

Sketching and Neural Networks

Title Sketching and Neural Networks
Authors Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar
Abstract High-dimensional sparse data present computational and statistical challenges for supervised learning. We propose compact linear sketches for reducing the dimensionality of the input, followed by a single layer neural network. We show that any sparse polynomial function can be computed, on nearly all sparse binary vectors, by a single layer neural network that takes a compact sketch of the vector as input. Consequently, when a set of sparse binary vectors is approximately separable using a sparse polynomial, there exists a single-layer neural network that takes a short sketch as input and correctly classifies nearly all the points. Previous work has proposed using sketches to reduce dimensionality while preserving the hypothesis class. However, the sketch size has an exponential dependence on the degree in the case of polynomial classifiers. In stark contrast, our approach of improper learning, which uses a larger hypothesis class, allows the sketch size to have a logarithmic dependence on the degree. Even in the linear case, our approach allows us to improve on the pesky $O({1}/{{\gamma}^2})$ dependence of random projections, on the margin $\gamma$. We empirically show that our approach leads to more compact neural networks than related methods such as feature hashing at equal or better performance.
Tasks
Published 2016-04-19
URL http://arxiv.org/abs/1604.05753v1
PDF http://arxiv.org/pdf/1604.05753v1.pdf
PWC https://paperswithcode.com/paper/sketching-and-neural-networks
Repo
Framework

Higher Order Recurrent Neural Networks

Title Higher Order Recurrent Neural Networks
Authors Rohollah Soltani, Hui Jiang
Abstract In this paper, we study novel neural network structures to better model long term dependency in sequential data. We propose to use more memory units to keep track of more preceding states in recurrent neural networks (RNNs), which are all recurrently fed to the hidden layers as feedback through different weighted paths. By extending the popular recurrent structure in RNNs, we provide the models with a better short-term memory mechanism to learn long term dependency in sequences. Analogous to digital filters in signal processing, we call these structures higher order RNNs (HORNNs). Similar to RNNs, HORNNs can also be learned using the back-propagation through time method. HORNNs are generally applicable to a variety of sequence modelling tasks. In this work, we have examined HORNNs for the language modeling task using two popular data sets, namely the Penn Treebank (PTB) and English text8 data sets. Experimental results have shown that the proposed HORNNs yield state-of-the-art performance on both data sets, significantly outperforming the regular RNNs as well as the popular LSTMs.
Tasks Language Modelling
Published 2016-04-30
URL http://arxiv.org/abs/1605.00064v1
PDF http://arxiv.org/pdf/1605.00064v1.pdf
PWC https://paperswithcode.com/paper/higher-order-recurrent-neural-networks
Repo
Framework
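
The abstract above describes feeding several preceding hidden states back into the hidden layer through separately weighted paths. A minimal forward-pass sketch of that idea follows, assuming a plain sum of the weighted paths and a tanh nonlinearity; the abstract does not specify either choice, and the names `hornn_forward`, `W_in`, `W_rec`, and the toy shapes are illustrative.

```python
import numpy as np

def hornn_forward(xs, W_in, W_rec, b):
    """Order-K recurrence: h_t = tanh(W_in x_t + sum_k W_rec[k] h_{t-k-1} + b).

    xs: (T, d_in), W_in: (d_h, d_in), W_rec: (K, d_h, d_h), b: (d_h,)
    """
    K, d_h, _ = W_rec.shape
    # history[k] holds h_{t-k-1}; states before the sequence start are zero.
    history = [np.zeros(d_h) for _ in range(K)]
    outputs = []
    for x in xs:
        pre = W_in @ x + b
        for k in range(K):
            pre += W_rec[k] @ history[k]  # one weighted feedback path per lag
        h = np.tanh(pre)
        history = [h] + history[:-1]  # shift the memory window by one step
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(0)
T, d_in, d_h, K = 5, 3, 4, 2
xs = rng.normal(size=(T, d_in))
W_in = 0.1 * rng.normal(size=(d_h, d_in))
W_rec = 0.1 * rng.normal(size=(K, d_h, d_h))
b = np.zeros(d_h)
hs = hornn_forward(xs, W_in, W_rec, b)  # shape (T, d_h)
```

Setting K = 1 reduces this recurrence to an ordinary single-lag RNN, which is a convenient sanity check for the sketch.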

Compressive Change Retrieval for Moving Object Detection

Title Compressive Change Retrieval for Moving Object Detection
Authors Tomoya Murase, Kanji Tanaka
Abstract Change detection, or anomaly detection, from street-view images acquired by an autonomous robot at multiple different times, is a major problem in robotic mapping and autonomous driving. Formulation as an image comparison task, which operates on a given pair of query and reference images is common to many existing approaches to this problem. Unfortunately, providing relevant reference images is not straightforward. In this paper, we propose a novel formulation for change detection, termed compressive change retrieval, which can operate on a query image and similar reference images retrieved from the web. Compared to previous formulations, there are two sources of difficulty. First, the retrieved reference images may frequently contain non-relevant reference images, because even state-of-the-art place-recognition techniques suffer from retrieval noise. Second, image comparison needs to be conducted in a compressed domain to minimize the storage cost of large collections of street-view images. To address the above issues, we also present a practical change detection algorithm that uses compressed bag-of-words (BoW) image representation as a scalable solution. The results of experiments conducted on a practical change detection task, “moving object detection (MOD),” using the publicly available Malaga dataset validate the effectiveness of the proposed approach.
Tasks Anomaly Detection, Autonomous Driving, Object Detection
Published 2016-08-06
URL http://arxiv.org/abs/1608.02051v1
PDF http://arxiv.org/pdf/1608.02051v1.pdf
PWC https://paperswithcode.com/paper/compressive-change-retrieval-for-moving
Repo
Framework

Image denoising via group sparsity residual constraint

Title Image denoising via group sparsity residual constraint
Authors Zhiyuan Zha, Xin Liu, Ziheng Zhou, Xiaohua Huang, Jingang Shi, Zhenhong Shang, Lan Tang, Yechao Bai, Qiong Wang, Xinggan Zhang
Abstract Group sparsity has shown great potential in various low-level vision tasks (e.g., image denoising, deblurring and inpainting). In this paper, we propose a new prior model for image denoising via group sparsity residual constraint (GSRC). To enhance the performance of group sparse-based image denoising, the concept of group sparsity residual is proposed, and thus, the problem of image denoising is translated into one that reduces the group sparsity residual. To reduce the residual, we first obtain some good estimation of the group sparse coefficients of the original image by the first-pass estimation of the noisy image, and then centralize the group sparse coefficients of the noisy image to the estimation. Experimental results have demonstrated that the proposed method not only outperforms many state-of-the-art denoising methods such as BM3D and WNNM, but also runs faster.
Tasks Deblurring, Denoising, Image Denoising
Published 2016-09-12
URL http://arxiv.org/abs/1609.03302v5
PDF http://arxiv.org/pdf/1609.03302v5.pdf
PWC https://paperswithcode.com/paper/image-denoising-via-group-sparsity-residual
Repo
Framework

Reviving Threshold-Moving: a Simple Plug-in Bagging Ensemble for Binary and Multiclass Imbalanced Data

Title Reviving Threshold-Moving: a Simple Plug-in Bagging Ensemble for Binary and Multiclass Imbalanced Data
Authors Guillem Collell, Drazen Prelec, Kaustubh Patil
Abstract Class imbalance presents a major hurdle in the application of data mining methods. A common practice to deal with it is to create ensembles of classifiers that learn from resampled balanced data. Examples include bagged decision trees combined with random undersampling (RUS) or the synthetic minority oversampling technique (SMOTE). However, most of the resampling methods entail asymmetric changes to the examples of different classes, which in turn can introduce their own biases in the model. Furthermore, those methods require a performance measure to be specified a priori before learning. An alternative is to use a so-called threshold-moving method that a posteriori changes the decision threshold of a model to counteract the imbalance, and thus has the potential to adapt to the performance measure of interest. Surprisingly, little attention has been paid to the potential of combining bagging ensembles with threshold-moving. In this paper, we present probability thresholding bagging (PT-bagging), a versatile plug-in method that fills this gap. Contrary to usual rebalancing practice, our method preserves the natural class distribution of the data resulting in well calibrated posterior probabilities. We also extend the proposed method to handle multiclass data. The method is validated on binary and multiclass benchmark data sets. We perform analyses that provide insights into the proposed method.
Tasks
Published 2016-06-28
URL http://arxiv.org/abs/1606.08698v3
PDF http://arxiv.org/pdf/1606.08698v3.pdf
PWC https://paperswithcode.com/paper/reviving-threshold-moving-a-simple-plug-in
Repo
Framework
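
The abstract above combines a bagging ensemble trained on the natural class distribution with an a posteriori change of decision threshold. The sketch below illustrates only the thresholding step, assuming one common rule (rescale the averaged posteriors by the class priors, then take the argmax); the abstract does not state the paper's exact rule. The function `threshold_moving_predict` and the probability values are made up for illustration.

```python
import numpy as np

def threshold_moving_predict(member_probs, class_priors):
    """Average ensemble posteriors, rescale by class priors, then take the argmax.

    member_probs: (n_models, n_samples, n_classes), class_priors: (n_classes,)
    """
    avg = member_probs.mean(axis=0)   # bagging: average the members' posteriors
    scores = avg / class_priors       # threshold moving: favor rare classes
    return scores.argmax(axis=1)

# Illustrative posteriors from three hypothetical ensemble members on two samples.
member_probs = np.array([
    [[0.80, 0.20], [0.95, 0.05]],
    [[0.70, 0.30], [0.90, 0.10]],
    [[0.75, 0.25], [0.97, 0.03]],
])
class_priors = np.array([0.9, 0.1])  # assumed training-set class frequencies

pred_default = member_probs.mean(axis=0).argmax(axis=1)
pred_moved = threshold_moving_predict(member_probs, class_priors)
```

In the binary case this rule is the same as predicting the minority class whenever its averaged posterior exceeds its prior, so the first sample flips to the minority class while the second does not.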

Finding Mirror Symmetry via Registration

Title Finding Mirror Symmetry via Registration
Authors Marcelo Cicconet, David G. C. Hildebrand, Hunter Elliott
Abstract Symmetry is prevalent in nature and a common theme in man-made designs. Both the human visual system and computer vision algorithms can use symmetry to facilitate object recognition and other tasks. Detecting mirror symmetry in images and data is, therefore, useful for a number of applications. Here, we demonstrate that the problem of fitting a plane of mirror symmetry to data in any Euclidean space can be reduced to the problem of registering two datasets. The exactness of the resulting solution depends entirely on the registration accuracy. This new Mirror Symmetry via Registration (MSR) framework involves (1) data reflection with respect to an arbitrary plane, (2) registration of original and reflected datasets, and (3) calculation of the eigenvector of eigenvalue -1 for the transformation matrix representing the reflection and registration mappings. To support MSR, we also introduce a novel 2D registration method based on random sample consensus of an ensemble of normalized cross-correlation matches. With this as its registration back-end, MSR achieves state-of-the-art performance for symmetry line detection in two independent 2D testing databases. We further demonstrate the generality of MSR by testing it on a database of 3D shapes with an iterative closest point registration back-end. Finally, we explore its applicability to examining symmetry in natural systems by assessing the degree of symmetry present in myelinated axon reconstructions from a larval zebrafish.
Tasks Object Recognition
Published 2016-11-18
URL http://arxiv.org/abs/1611.05971v2
PDF http://arxiv.org/pdf/1611.05971v2.pdf
PWC https://paperswithcode.com/paper/finding-mirror-symmetry-via-registration
Repo
Framework
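
The three MSR steps listed in the abstract above can be sketched on a toy 2D point set. This is an illustration under strong assumptions, not the paper's registration back-end: point correspondences are taken as known, and a least-squares rigid fit (the Kabsch procedure) stands in for the registration step. The names `kabsch`, `n_true`, and `c`, and the synthetic data, are hypothetical.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~= dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against an improper rotation
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c

# Synthetic data symmetric about a line with unit normal n_true through point c.
rng = np.random.default_rng(1)
n_true = np.array([np.cos(0.7), np.sin(0.7)])
c = np.array([1.0, 2.0])
P = rng.normal(size=(20, 2))
P_mirror = P - 2.0 * ((P - c) @ n_true)[:, None] * n_true
X = np.vstack([P, P_mirror])

# (1) Reflect the data about an arbitrary line (here the x-axis).
S = np.diag([1.0, -1.0])
Y = X @ S.T

# (2) Register the reflected data to the original; each reflected point
#     should land on its mirror partner (correspondences assumed known).
target = np.vstack([P_mirror, P])
R, t = kabsch(Y, target)

# (3) The eigenvector of R @ S with eigenvalue -1 is the symmetry-line normal.
vals, vecs = np.linalg.eig(R @ S)
normal = np.real(vecs[:, np.argmin(np.abs(vals + 1.0))])
normal /= np.linalg.norm(normal)
point_on_line = t / 2.0  # midpoint of the origin and its mirror image
```

The recovered `normal` matches `n_true` up to sign, and `point_on_line` fixes the offset of the symmetry line, since the combined map sends the origin to `t`.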

Consensus Based Medical Image Segmentation Using Semi-Supervised Learning And Graph Cuts

Title Consensus Based Medical Image Segmentation Using Semi-Supervised Learning And Graph Cuts
Authors Dwarikanath Mahapatra
Abstract Medical image segmentation requires consensus ground truth segmentations to be derived from multiple expert annotations. A novel approach is proposed that obtains consensus segmentations from experts using graph cuts (GC) and semi supervised learning (SSL). Popular approaches use iterative Expectation Maximization (EM) to estimate the final annotation and quantify annotator’s performance. Such techniques pose the risk of getting trapped in local minima. We propose a self consistency (SC) score to quantify annotator consistency using low level image features. SSL is used to predict missing annotations by considering global features and local image consistency. The SC score also serves as the penalty cost in a second order Markov random field (MRF) cost function optimized using graph cuts to derive the final consensus label. Graph cut obtains a global maximum without an iterative procedure. Experimental results on synthetic images, real data of Crohn’s disease patients and retinal images show our final segmentation to be accurate and more consistent than competing methods.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2016-12-07
URL http://arxiv.org/abs/1612.02166v3
PDF http://arxiv.org/pdf/1612.02166v3.pdf
PWC https://paperswithcode.com/paper/consensus-based-medical-image-segmentation
Repo
Framework

Fine-grained Recurrent Neural Networks for Automatic Prostate Segmentation in Ultrasound Images

Title Fine-grained Recurrent Neural Networks for Automatic Prostate Segmentation in Ultrasound Images
Authors Xin Yang, Lequan Yu, Lingyun Wu, Yi Wang, Dong Ni, Jing Qin, Pheng-Ann Heng
Abstract Boundary incompleteness poses great challenges for automatic prostate segmentation in ultrasound images. A shape prior can provide strong guidance in estimating the missing boundary, but traditional shape models often suffer from hand-crafted descriptors and local information loss in the fitting procedure. In this paper, we attempt to address those issues with a novel framework. The proposed framework can seamlessly integrate feature extraction and shape prior exploring, and estimate the complete boundary in a sequential manner. Our framework is composed of three key modules. Firstly, we serialize the static 2D prostate ultrasound images into dynamic sequences and then predict prostate shapes by sequentially exploring shape priors. Intuitively, we propose to learn the shape prior with the biologically plausible Recurrent Neural Networks (RNNs). This module is corroborated to be effective in dealing with the boundary incompleteness. Secondly, to alleviate the bias caused by different serialization manners, we propose a multi-view fusion strategy to merge shape predictions obtained from different perspectives. Thirdly, we further implant the RNN core into a multiscale Auto-Context scheme to successively refine the details of the shape prediction map. With extensive validation on challenging prostate ultrasound images, our framework bridges severe boundary incompleteness and achieves the best performance in prostate boundary delineation when compared with several advanced methods. Additionally, our approach is general and can be extended to other medical image segmentation tasks, where boundary incompleteness is one of the main challenges.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2016-12-06
URL http://arxiv.org/abs/1612.01655v1
PDF http://arxiv.org/pdf/1612.01655v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-recurrent-neural-networks-for
Repo
Framework

Optimal Surface Segmentation with Convex Priors in Irregularly Sampled Space

Title Optimal Surface Segmentation with Convex Priors in Irregularly Sampled Space
Authors Abhay Shah, Michael D. Abramoff, Xiaodong Wu
Abstract Optimal surface segmentation is a state-of-the-art method used for segmentation of multiple globally optimal surfaces in volumetric datasets. The method is widely used in numerous medical image segmentation applications. However, nodes in the graph based optimal surface segmentation method typically encode uniformly distributed orthogonal voxels of the volume. Thus the segmentation cannot attain an accuracy greater than a single unit voxel, i.e. the distance between two adjoining nodes in graph space. Segmentation accuracy higher than a unit voxel is achievable by exploiting partial volume information in the voxels which shall result in non-equidistant spacing between adjoining graph nodes. This paper reports a generalized graph based multiple surface segmentation method with convex priors which can optimally segment the target surfaces in an irregularly sampled space. The proposed method allows non-equidistant spacing between the adjoining graph nodes to achieve subvoxel segmentation accuracy by utilizing the partial volume information in the voxels. The partial volume information in the voxels is exploited by computing a displacement field from the original volume data to identify the subvoxel-accurate centers within each voxel resulting in non-equidistant spacing between the adjoining graph nodes. The smoothness of each surface modeled as a convex constraint governs the connectivity and regularity of the surface. We employ an edge-based graph representation to incorporate the necessary constraints and the globally optimal solution is obtained by computing a minimum s-t cut. The proposed method was validated on 10 intravascular multi-frame ultrasound image datasets for subvoxel segmentation accuracy. In all cases, the approach yielded highly accurate results. Our approach can be readily extended to higher-dimensional segmentations.
Tasks Medical Image Segmentation, Semantic Segmentation, Super-Resolution
Published 2016-11-09
URL http://arxiv.org/abs/1611.03059v3
PDF http://arxiv.org/pdf/1611.03059v3.pdf
PWC https://paperswithcode.com/paper/optimal-surface-segmentation-with-convex
Repo
Framework