May 5, 2019

Paper Group ANR 509

The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

Title The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection
Authors Pascal Mettes, Dennis C. Koelma, Cees G. M. Snoek
Abstract This paper strives for video event detection using a representation learned from deep convolutional neural networks. Different from the leading approaches, which all learn from the 1,000 classes defined in the ImageNet Large Scale Visual Recognition Challenge, we investigate how to leverage the complete ImageNet hierarchy for pre-training deep networks. To deal with the problems of over-specific classes and classes with few images, we introduce a bottom-up and top-down approach for reorganization of the ImageNet hierarchy based on all its 21,814 classes and more than 14 million images. Experiments on the TRECVID Multimedia Event Detection 2013 and 2015 datasets show that video representations derived from the layers of a deep neural network pre-trained with our reorganized hierarchy i) improve over standard pre-training, ii) are complementary among different reorganizations, iii) maintain the benefits of fusion with other modalities, and iv) lead to state-of-the-art event detection results. The reorganized hierarchies and their derived Caffe models are publicly available at http://tinyurl.com/imagenetshuffle.
Tasks Object Recognition
Published 2016-02-23
URL http://arxiv.org/abs/1602.07119v1
PDF http://arxiv.org/pdf/1602.07119v1.pdf
PWC https://paperswithcode.com/paper/the-imagenet-shuffle-reorganized-pre-training
Repo
Framework
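The bottom-up reorganization the abstract describes (folding over-specific classes and classes with few images into their parents) can be sketched as a single leaves-first pass over the hierarchy. This is an illustrative reconstruction, not the authors' exact procedure: the threshold, data structures, and chain-resolution step are assumptions.

```python
def merge_small_classes(parent, counts, order, min_images=200):
    """Fold under-populated classes into their parents, bottom-up.

    parent: dict mapping node -> parent node (root maps to None)
    counts: dict mapping node -> number of images directly in that class
    order:  nodes listed leaves-first (every child before its parent)
    Returns a dict mapping each original node to the class that absorbs it.
    """
    target = {n: n for n in parent}       # where each node's images end up
    total = dict(counts)                  # running image count per node
    for n in order:
        p = parent[n]
        if p is not None and total[n] < min_images:
            target[n] = p                 # fold this class into its parent
            total[p] += total[n]          # parent inherits the images
    def resolve(n):
        # Follow merge chains so every node maps to a surviving class.
        while target[n] != n:
            n = target[n]
        return n
    return {n: resolve(n) for n in parent}
```

A top-down pass (splitting overly broad nodes) would then operate on the surviving classes; only the bottom-up direction is sketched here.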

Super-resolution Reconstruction of SAR Image based on Non-Local Means Denoising Combined with BP Neural Network

Title Super-resolution Reconstruction of SAR Image based on Non-Local Means Denoising Combined with BP Neural Network
Authors Zeling Wu, Haoxiang Wang
Abstract In this article, we propose a super-resolution method to address the problem of low spatial resolution caused by the limitations of imaging devices. We exploit the strong nonlinear mapping ability of back-propagation neural networks (BPNN). Training sample images are obtained by undersampling. The inputs of the BPNN are pixels selected by Non-Local Means (NL-Means): exploiting the self-similarity of images, these inputs are obtained from a modified NL-Means tailored to super-resolution. In addition, a small change to the NL-Means kernel function yields clearer edges in the downsampled image. Experimental results, evaluated with the Peak Signal to Noise Ratio (PSNR) and the Equivalent Number of Looks (ENL), indicate that adding similar pixels as inputs improves the results compared with not taking them into consideration.
Tasks Denoising, Super-Resolution
Published 2016-12-14
URL http://arxiv.org/abs/1612.04755v1
PDF http://arxiv.org/pdf/1612.04755v1.pdf
PWC https://paperswithcode.com/paper/super-resolution-reconstruction-of-sar-image
Repo
Framework
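The "similar pixels" the abstract feeds into the BPNN come from NL-Means weighting, which scores candidate pixels by patch similarity. A minimal sketch of that weighting follows; the patch size, search window, and bandwidth `h` are illustrative, not the paper's values, and the paper's modified kernel is not reproduced here:

```python
import numpy as np

def nlmeans_weights(img, i, j, patch=1, search=2, h=10.0):
    """NL-Means similarity weights for pixel (i, j).

    Compares the patch around (i, j) against patches in a search window
    and weights each candidate pixel by exp(-||patch diff||^2 / h^2).
    The most heavily weighted pixels would serve as extra BPNN inputs.
    """
    H, W = img.shape
    ref = img[i - patch:i + patch + 1, j - patch:j + patch + 1].astype(float)
    weights = {}
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            y, x = i + di, j + dj
            if patch <= y < H - patch and patch <= x < W - patch:
                cand = img[y - patch:y + patch + 1,
                           x - patch:x + patch + 1].astype(float)
                d2 = float(((ref - cand) ** 2).mean())
                weights[(y, x)] = float(np.exp(-d2 / (h * h)))
    return weights
```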

Construction of extended 3D field of views of the internal bladder wall surface: a proof of concept

Title Construction of extended 3D field of views of the internal bladder wall surface: a proof of concept
Authors Achraf Ben-Hamadou, Christian Daul, Charles Soussen
Abstract 3D extended field of views (FOVs) of the internal bladder wall facilitate lesion diagnosis, patient follow-up and treatment traceability. In this paper, we propose a 3D image mosaicing algorithm guided by 2D cystoscopic video-image registration for obtaining textured FOV mosaics. In this feasibility study, the registration makes use of data from a 3D cystoscope prototype providing, in addition to each small FOV image, some 3D points located on the surface. This proof of concept shows that textured surfaces can be constructed with minimally modified cystoscopes. The potential of the method is demonstrated on numerical and real phantoms reproducing various surface shapes. Pig and human bladder textures are superimposed on phantoms with known shape and dimensions. These data allow for quantitative assessment of the 3D mosaicing algorithm based on the registration of images simulating bladder textures.
Tasks Image Registration
Published 2016-07-16
URL http://arxiv.org/abs/1607.04773v1
PDF http://arxiv.org/pdf/1607.04773v1.pdf
PWC https://paperswithcode.com/paper/construction-of-extended-3d-field-of-views-of
Repo
Framework

Constraint matrix factorization for space variant PSFs field restoration

Title Constraint matrix factorization for space variant PSFs field restoration
Authors F. M. Ngolè Mboula, J. -L. Starck, K. Okumura, J. Amiaux, P. Hudelot
Abstract Context: in large-scale spatial surveys, the Point Spread Function (PSF) varies across the instrument field of view (FOV). Local measurements of the PSF are given by images of isolated stars. Yet, these estimates may not be directly usable for post-processing because of observational noise and, potentially, aliasing. Aims: given a set of aliased and noisy star images from a telescope, we want to estimate well-resolved and noise-free PSFs at the observed star positions, in particular by exploiting the spatial correlation of the PSFs across the FOV. Contributions: we introduce RCA (Resolved Components Analysis), a noise-robust dimension reduction and super-resolution method based on matrix factorization. We propose an original way of using the spatial correlation of the PSFs in the restoration process through sparsity. The introduced formalism can be applied to data sets correlated with respect to any Euclidean parametric space. Results: we tested our method on simulated monochromatic PSFs of the Euclid telescope (launch planned for 2020). The proposed method outperforms existing PSF restoration and dimension reduction methods. We show that a coupled sparsity constraint on individual PSFs and their spatial distribution yields a significant improvement in both the restored PSF shapes and the PSF subspace identification in the presence of aliasing. Perspectives: RCA can be naturally extended to account for the wavelength dependency of the PSFs.
Tasks Dimensionality Reduction, Super-Resolution
Published 2016-08-29
URL http://arxiv.org/abs/1608.08104v3
PDF http://arxiv.org/pdf/1608.08104v3.pdf
PWC https://paperswithcode.com/paper/constraint-matrix-factorization-for-space
Repo
Framework
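The matrix-factorization core of the abstract (stacked star images decomposed into a few components plus per-star weights) has a plain truncated-SVD baseline, shown below. This is only the dimension-reduction skeleton; RCA's sparsity constraints, super-resolution, and noise robustness are not reproduced:

```python
import numpy as np

def low_rank_psfs(Y, r):
    """Truncated-SVD baseline for PSF dimension reduction.

    Y: (pixels, stars) matrix whose columns are vectorized star images.
    Returns r spatial components and the per-star mixing weights, so that
    components @ weights approximates Y.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    components = U[:, :r] * s[:r]   # scaled left singular vectors
    weights = Vt[:r]                # per-star coefficients
    return components, weights
```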

Reinterpreting the Transformation Posterior in Probabilistic Image Registration

Title Reinterpreting the Transformation Posterior in Probabilistic Image Registration
Authors Jie Luo, Karteek Popuri, Dana Cobzas, Hongyi Ding, Masashi Sugiyama
Abstract Probabilistic image registration methods estimate the posterior distribution of the transformation. The conventional way of interpreting the transformation posterior is to use the mode as the most likely transformation and assign its corresponding intensity to the registered voxel. Meanwhile, summary statistics of the posterior are employed to evaluate the registration uncertainty, that is, the trustworthiness of the registered image. Despite its wide acceptance, this convention has never been justified. In this paper, based on illustrative examples, we question the correctness and usefulness of conventional methods. In order to faithfully translate the transformation posterior, we propose to encode the variability of values into a novel data type called ensemble fields. Ensemble fields can serve as a complement to the registered image and a foundation for developing advanced methods to characterize the uncertainty in registration-based tasks. We demonstrate the potential of ensemble fields through pilot examples.
Tasks Image Registration
Published 2016-04-07
URL http://arxiv.org/abs/1604.01889v1
PDF http://arxiv.org/pdf/1604.01889v1.pdf
PWC https://paperswithcode.com/paper/reinterpreting-the-transformation-posterior
Repo
Framework
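The ensemble-field idea contrasts with mode-based warping: instead of committing to the single most likely transformation, every target position keeps the set of intensities produced by all plausible transformations. A toy 1-D illustration (integer shifts only; not the authors' code):

```python
def ensemble_field_1d(moving, posterior_shifts):
    """Per-position intensity distributions under sampled shifts.

    moving:            list of intensities of the moving image
    posterior_shifts:  integer shifts sampled from the transformation
                       posterior
    Returns, for each target position, the list of intensities it could
    receive, i.e. a per-voxel value distribution rather than one value.
    """
    n = len(moving)
    ensemble = [[] for _ in range(n)]
    for s in posterior_shifts:
        for x in range(n):
            src = x + s
            if 0 <= src < n:              # sample falls inside the image
                ensemble[x].append(moving[src])
    return ensemble
```

Summary statistics of each per-voxel list (spread, multimodality) then quantify local registration uncertainty, which a single mode-warped value cannot express.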

Fundamental Limits in Multi-image Alignment

Title Fundamental Limits in Multi-image Alignment
Authors Cecilia Aguerrebere, Mauricio Delbracio, Alberto Bartesaghi, Guillermo Sapiro
Abstract The performance of multi-image alignment, bringing different images into one coordinate system, is critical in many applications with varied signal-to-noise ratio (SNR) conditions. A great amount of effort is being invested into developing methods to solve this problem. Several important questions thus arise, including: Which are the fundamental limits in multi-image alignment performance? Does having access to more images improve the alignment? Theoretical bounds provide a fundamental benchmark to compare methods and can help establish whether improvements can be made. In this work, we tackle the problem of finding the performance limits in image registration when multiple shifted and noisy observations are available. We derive and analyze the Cramér-Rao and Ziv-Zakai lower bounds under different statistical models for the underlying image. The accuracy of the derived bounds is experimentally assessed through a comparison to the maximum likelihood estimator. We show the existence of different behavior zones depending on the difficulty level of the problem, given by the SNR conditions of the input images. We find that increasing the number of images is only useful below a certain SNR threshold, above which the pairwise MLE estimation proves to be optimal. The analysis we present here brings further insight into the fundamental limitations of the multi-image alignment problem.
Tasks Image Registration
Published 2016-02-04
URL http://arxiv.org/abs/1602.01541v1
PDF http://arxiv.org/pdf/1602.01541v1.pdf
PWC https://paperswithcode.com/paper/fundamental-limits-in-multi-image-alignment
Repo
Framework
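The Cramér-Rao machinery the abstract refers to can be illustrated in its simplest textbook form, a single 1-D shift under additive white Gaussian noise (this is a special case for intuition, not the paper's full multi-image derivation):

```latex
% Observation model: one shifted, noisy copy of a signal x
y(t) = x(t - \tau) + n(t), \qquad n(t) \sim \mathcal{N}(0, \sigma^2)

% Fisher information for the shift, and the resulting bound on any
% unbiased estimator \hat{\tau}:
I(\tau) = \frac{1}{\sigma^2} \int \lvert x'(t) \rvert^2 \, dt
\quad\Longrightarrow\quad
\operatorname{Var}(\hat{\tau}) \;\ge\; I(\tau)^{-1}
= \frac{\sigma^2}{\int \lvert x'(t) \rvert^2 \, dt}
```

Because Fisher information from independent observations adds, stacking more images shrinks this bound, which is consistent with the paper's finding that extra images help in the low-SNR regime.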

Inferring Logical Forms From Denotations

Title Inferring Logical Forms From Denotations
Authors Panupong Pasupat, Percy Liang
Abstract A core problem in learning semantic parsers from denotations is picking out consistent logical forms, those that yield the correct denotation, from a combinatorially large space. To control the search space, previous work relied on a restricted set of rules, which limits expressivity. In this paper, we consider a much more expressive class of logical forms, and show how to use dynamic programming to efficiently represent the complete set of consistent logical forms. Expressivity also introduces many more spurious logical forms which are consistent with the correct denotation but do not represent the meaning of the utterance. To address this, we generate fictitious worlds and use crowdsourced denotations on these worlds to filter out spurious logical forms. On the WikiTableQuestions dataset, we increase the coverage of answerable questions from 53.5% to 76%, and the additional crowdsourced supervision lets us rule out 92.1% of spurious logical forms.
Tasks
Published 2016-06-22
URL http://arxiv.org/abs/1606.06900v2
PDF http://arxiv.org/pdf/1606.06900v2.pdf
PWC https://paperswithcode.com/paper/inferring-logical-forms-from-denotations
Repo
Framework
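The notions of "consistent" and "spurious" logical forms can be made concrete with a toy enumeration over a single table column; the paper's dynamic program handles a vastly larger compositional space, and the operator set here is purely illustrative:

```python
def consistent_forms(values, target):
    """Return the names of toy logical forms whose denotation matches
    the target answer. Forms beyond the utterance's intended one are
    the 'spurious' forms the paper filters with fictitious worlds.
    """
    forms = {
        "max":   max(values),
        "min":   min(values),
        "sum":   sum(values),
        "count": len(values),
    }
    return [name for name, denotation in forms.items()
            if denotation == target]
```

On a second, fictitious column of values, the spurious form's denotation will usually disagree with the crowdsourced answer, which is exactly how it gets ruled out.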

Semi-supervised Clustering for Short Text via Deep Representation Learning

Title Semi-supervised Clustering for Short Text via Deep Representation Learning
Authors Zhiguo Wang, Haitao Mi, Abraham Ittycheriah
Abstract In this work, we propose a semi-supervised method for short text clustering, where we represent texts as distributed vectors with neural networks, and use a small amount of labeled data to specify our intention for clustering. We design a novel objective to combine the representation learning process and the k-means clustering process together, and optimize the objective with both labeled data and unlabeled data iteratively until convergence through three steps: (1) assign each short text to its nearest centroid based on its representation from the current neural networks; (2) re-estimate the cluster centroids based on cluster assignments from step (1); (3) update neural networks according to the objective by keeping centroids and cluster assignments fixed. Experimental results on four datasets show that our method works significantly better than several other text clustering methods.
Tasks Representation Learning, Text Clustering
Published 2016-02-22
URL http://arxiv.org/abs/1602.06797v2
PDF http://arxiv.org/pdf/1602.06797v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-clustering-for-short-text-via
Repo
Framework
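Steps (1) and (2) of the abstract's loop can be sketched with representations held fixed; step (3), the neural-network update, is omitted, and the initialization details are assumptions rather than the authors' procedure:

```python
import numpy as np

def semi_supervised_kmeans(X, k, labeled_idx, labels, iters=20, seed=0):
    """Semi-supervised k-means over fixed representations X (n, d).

    labeled_idx / labels pin a few points to known clusters; unlabeled
    points follow their nearest centroid, as in steps (1)-(2).
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)].astype(float)
    for c in range(k):  # seed centroids from labeled data where available
        pts = X[[i for i, l in zip(labeled_idx, labels) if l == c]]
        if len(pts):
            centroids[c] = pts.mean(axis=0)
    for _ in range(iters):
        # (1) assign each point to its nearest centroid
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        assign[labeled_idx] = labels          # labeled data keep their class
        # (2) re-estimate centroids from the assignments
        for c in range(k):
            if (assign == c).any():
                centroids[c] = X[assign == c].mean(0)
    return assign, centroids
```

In the full method, the resulting assignments and centroids would feed a gradient step on the representation network before the loop repeats.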

Divisive-agglomerative algorithm and complexity of automatic classification problems

Title Divisive-agglomerative algorithm and complexity of automatic classification problems
Authors Alexander Rubchinsky
Abstract An algorithm for solving the Automatic Classification (AC for brevity) problem is set forth in the paper. In the AC problem, it is required to find one or several partitions, starting from a given pattern matrix or a dissimilarity/similarity matrix.
Tasks
Published 2016-07-05
URL http://arxiv.org/abs/1607.02419v1
PDF http://arxiv.org/pdf/1607.02419v1.pdf
PWC https://paperswithcode.com/paper/divisive-agglomerative-algorithm-and
Repo
Framework

Refining Architectures of Deep Convolutional Neural Networks

Title Refining Architectures of Deep Convolutional Neural Networks
Authors Sukrit Shankar, Duncan Robertson, Yani Ioannou, Antonio Criminisi, Roberto Cipolla
Abstract Deep Convolutional Neural Networks (CNNs) have recently evinced immense success for various image recognition tasks. However, a question of paramount importance is somewhat unanswered in deep learning research - is the selected CNN optimal for the dataset in terms of accuracy and model size? In this paper, we intend to answer this question and introduce a novel strategy that alters the architecture of a given CNN for a specified dataset, to potentially enhance the original accuracy while possibly reducing the model size. We use two operations for architecture refinement, viz. stretching and symmetrical splitting. Our procedure starts with a pre-trained CNN for a given dataset, and optimally decides the stretch and split factors across the network to refine the architecture. We empirically demonstrate the necessity of the two operations. We evaluate our approach on two natural scenes attributes datasets, SUN Attributes and CAMIT-NSAD, with architectures of GoogleNet and VGG-11, that are quite contrasting in their construction. We justify our choice of datasets, and show that they are interestingly distinct from each other, and together pose a challenge to our architectural refinement algorithm. Our results substantiate the usefulness of the proposed method.
Tasks
Published 2016-04-22
URL http://arxiv.org/abs/1604.06832v1
PDF http://arxiv.org/pdf/1604.06832v1.pdf
PWC https://paperswithcode.com/paper/refining-architectures-of-deep-convolutional
Repo
Framework

Fusing Deep Convolutional Networks for Large Scale Visual Concept Classification

Title Fusing Deep Convolutional Networks for Large Scale Visual Concept Classification
Authors Hilal Ergun, Mustafa Sert
Abstract Deep learning architectures are showing great promise in various computer vision domains including image classification, object detection, event detection and action recognition. In this study, we investigate various aspects of convolutional neural networks (CNNs) from the big data perspective. We analyze recent studies and different network architectures both in terms of running time and accuracy. We present extensive empirical information along with best practices for big data practitioners. Using these best practices we propose efficient fusion mechanisms both for single and multiple network models. We present state-of-the art results on benchmark datasets while keeping computational costs at a lower level. Another contribution of our paper is that these state-of-the-art results can be reached without using extensive data augmentation techniques.
Tasks Data Augmentation, Image Classification, Object Detection, Temporal Action Localization
Published 2016-08-05
URL http://arxiv.org/abs/1608.01866v1
PDF http://arxiv.org/pdf/1608.01866v1.pdf
PWC https://paperswithcode.com/paper/fusing-deep-convolutional-networks-for-large
Repo
Framework
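One common "efficient fusion mechanism" of the kind this paper studies is weighted late fusion of per-model class probabilities. The sketch below is a generic version, not necessarily the authors' exact scheme, and the weights are illustrative:

```python
import numpy as np

def late_fusion(prob_list, weights=None):
    """Fuse per-model class-probability vectors into one prediction.

    prob_list: list of 1-D probability arrays, one per CNN
    weights:   optional per-model weights (defaults to uniform)
    """
    P = np.stack(prob_list)                        # (models, classes)
    w = (np.ones(len(P)) / len(P) if weights is None
         else np.asarray(weights, float))
    fused = (w[:, None] * P).sum(0)
    return fused / fused.sum()                     # renormalize
```

Late fusion keeps each network's forward pass unchanged, which is why it stays cheap at the "big data" scales the abstract emphasizes.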

The More You Know: Using Knowledge Graphs for Image Classification

Title The More You Know: Using Knowledge Graphs for Image Classification
Authors Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta
Abstract One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world. Humans can learn about the characteristics of objects and the relationships that occur between them to learn a large variety of visual concepts, often with few examples. This paper investigates the use of structured prior knowledge in the form of knowledge graphs and shows that using this knowledge improves performance on image classification. We build on recent work on end-to-end learning on graphs, introducing the Graph Search Neural Network as a way of efficiently incorporating large knowledge graphs into a vision classification pipeline. We show in a number of experiments that our method outperforms standard neural network baselines for multi-label classification.
Tasks Image Classification, Knowledge Graphs, Multi-Label Classification
Published 2016-12-14
URL http://arxiv.org/abs/1612.04844v2
PDF http://arxiv.org/pdf/1612.04844v2.pdf
PWC https://paperswithcode.com/paper/the-more-you-know-using-knowledge-graphs-for
Repo
Framework
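The graph-based reasoning the abstract builds on can be grounded with one generic message-passing step over a knowledge-graph adjacency matrix. This is the family of update the Graph Search Neural Network belongs to, not the GSNN update itself (which adds selective node expansion and gated state updates):

```python
import numpy as np

def graph_step(adj, h, w):
    """One message-passing step: each node sums its neighbors' states
    (adj @ h), applies a shared linear map w, then a ReLU.

    adj: (n, n) adjacency matrix of the knowledge graph
    h:   (n, d) node states
    w:   (d, d) shared weight matrix
    """
    return np.maximum(adj @ h @ w, 0.0)
```

Stacking such steps lets label nodes accumulate evidence from related concepts, which is the mechanism behind the reported multi-label gains.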

Mutual information for fitting deep nonlinear models

Title Mutual information for fitting deep nonlinear models
Authors Jacob S. Hunter, Nathan O. Hodas
Abstract Deep nonlinear models pose a challenge for fitting parameters due to lack of knowledge of the hidden layer and the potentially non-affine relation of the initial and observed layers. In the present work we investigate the use of information theoretic measures such as mutual information and Kullback-Leibler (KL) divergence as objective functions for fitting such models without knowledge of the hidden layer. We investigate one model as a proof of concept and one application to cognitive performance. We further investigate the use of optimizers with these methods. Mutual information is largely successful as an objective, depending on the parameters. KL divergence is found to be similarly successful, given some knowledge of the statistics of the hidden layer.
Tasks
Published 2016-12-17
URL http://arxiv.org/abs/1612.05708v1
PDF http://arxiv.org/pdf/1612.05708v1.pdf
PWC https://paperswithcode.com/paper/mutual-information-for-fitting-deep-nonlinear
Repo
Framework
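A common plug-in estimator for the mutual-information objective the abstract describes is the 2-D histogram estimate below; the authors' estimator and bin choices may differ:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X; Y) in nats.

    Bins the paired samples, forms the empirical joint p(x, y) and
    marginals, and returns the KL divergence between the joint and the
    product of marginals (zero iff the binned variables look independent).
    """
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal over y
    py = pxy.sum(axis=0, keepdims=True)   # marginal over x
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())
```

As an objective, the model parameters would be adjusted to maximize this quantity between the model's output and the observed layer.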

Visual Tracking via Boolean Map Representations

Title Visual Tracking via Boolean Map Representations
Authors Kaihua Zhang, Qingshan Liu, Ming-Hsuan Yang
Abstract In this paper, we present a simple yet effective Boolean map based representation that exploits connectivity cues for visual tracking. We describe a target object with histogram of oriented gradients and raw color features, each of which is characterized by a set of Boolean maps generated by uniformly thresholding their values. The Boolean maps effectively encode multi-scale connectivity cues of the target with different granularities. The fine-grained Boolean maps capture spatially structural details that are effective for precise target localization, while the coarse-grained ones encode global shape information that is robust to large target appearance variations. Finally, all the Boolean maps together form a robust representation that can be approximated by an explicit feature map of the intersection kernel, which is fed into a logistic regression classifier with online update, and the target location is estimated within a particle filter framework. The proposed representation scheme is computationally efficient and achieves favorable performance in terms of accuracy and robustness compared with state-of-the-art tracking methods on a large benchmark dataset of 50 image sequences.
Tasks Visual Tracking
Published 2016-10-30
URL http://arxiv.org/abs/1610.09652v1
PDF http://arxiv.org/pdf/1610.09652v1.pdf
PWC https://paperswithcode.com/paper/visual-tracking-via-boolean-map
Repo
Framework
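The Boolean map generation step (uniformly thresholding a feature channel into a stack of binary maps) is direct to sketch; the number of maps here is an illustrative parameter, not the paper's setting:

```python
import numpy as np

def boolean_maps(feature, n_maps=8):
    """Uniformly threshold one feature channel into a stack of Boolean maps.

    Thresholds are spaced evenly between the channel's min and max
    (endpoints excluded); each map marks where the feature exceeds one
    threshold, from a permissive map to a strict one.
    """
    lo, hi = float(feature.min()), float(feature.max())
    thresholds = np.linspace(lo, hi, n_maps + 2)[1:-1]   # interior levels
    return np.stack([feature > t for t in thresholds]).astype(np.uint8)
```

Low thresholds give the coarse global-shape maps and high thresholds the fine-grained ones, matching the multi-granularity behavior the abstract describes.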

A New Method to Visualize Deep Neural Networks

Title A New Method to Visualize Deep Neural Networks
Authors Luisa M. Zintgraf, Taco S. Cohen, Max Welling
Abstract We present a method for visualising the response of a deep neural network to a specific input. For image data, for instance, our method will highlight areas that provide evidence in favor of, and against, choosing a certain class. The method overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of convolutional networks, which is important both to improve models and to accelerate the adoption of such methods in e.g. medicine. In experiments on ImageNet data, we illustrate how the method works and can be applied in different ways to understand deep neural nets.
Tasks Decision Making
Published 2016-03-08
URL http://arxiv.org/abs/1603.02518v3
PDF http://arxiv.org/pdf/1603.02518v3.pdf
PWC https://paperswithcode.com/paper/a-new-method-to-visualize-deep-neural
Repo
Framework
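The "evidence for and against a class" idea can be approximated with an occlusion-style sketch: replace each patch with a neutral value and record how the class score moves. The paper marginalizes patches out properly; constant filling is a deliberate simplification here, and `score_fn` is a hypothetical classifier interface:

```python
import numpy as np

def relevance_map(img, score_fn, patch=2, fill=0.0):
    """Per-pixel relevance for a class score.

    For each patch, compute base_score - occluded_score: positive values
    mean the patch supported the class (evidence for), negative values
    mean it argued against it (evidence against).
    """
    H, W = img.shape
    base = score_fn(img)
    rel = np.zeros((H, W))
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = fill
            rel[i:i + patch, j:j + patch] = base - score_fn(occluded)
    return rel
```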