May 7, 2019


Paper Group ANR 117


Stacked Autoencoders for Medical Image Search


Title	Stacked Autoencoders for Medical Image Search
Authors	S. Sharma, I. Umar, L. Ospina, D. Wong, H. R. Tizhoosh
Abstract	Medical images can be a valuable resource for reliable information to support medical diagnosis. However, the large volume of medical images makes it challenging to retrieve relevant information given a particular scenario. To solve this challenge, content-based image retrieval (CBIR) attempts to characterize images (or image regions) with invariant content information in order to facilitate image search. This work presents a feature extraction technique for medical images using stacked autoencoders, which encode images to binary vectors. The technique is applied to the IRMA dataset, a collection of 14,410 x-ray images, in order to demonstrate the ability of autoencoders to retrieve similar x-rays given test queries. Using the IRMA dataset as a benchmark, it was found that stacked autoencoders gave excellent results with a retrieval error of 376 for 1,733 test images with a compression of 74.61%.
Tasks	Content-Based Image Retrieval, Image Retrieval, Medical Diagnosis
Published	2016-10-02
URL	http://arxiv.org/abs/1610.00320v1
PDF	http://arxiv.org/pdf/1610.00320v1.pdf
PWC	https://paperswithcode.com/paper/stacked-autoencoders-for-medical-image-search
Repo
Framework
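
The abstract above describes encoding images to binary vectors and retrieving similar x-rays for a query. A minimal sketch of the retrieval step follows, assuming the codes have already been computed and that they are compared by Hamming distance (the abstract does not name the distance measure). The image ids, the 8-bit codes, and the function names are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: dict, k: int = 3) -> list:
    """Return the ids of the k database codes closest to the query code."""
    ranked = sorted(
        database,
        key=lambda image_id: (hamming(query, database[image_id]), image_id),
    )
    return ranked[:k]


# Hypothetical 8-bit codes; a real encoder would emit much longer vectors.
codes = {
    "xray_a": 0b10110010,
    "xray_b": 0b10110011,
    "xray_c": 0b01001101,
}
nearest = retrieve(0b10110110, codes, k=2)
```

Storing codes as integers keeps the comparison to one XOR and a bit count, which is one reason binary codes are attractive for large image collections.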

Unsupervised Non Linear Dimensionality Reduction Machine Learning methods applied to Multiparametric MRI in cerebral ischemia: Preliminary Results


Title	Unsupervised Non Linear Dimensionality Reduction Machine Learning methods applied to Multiparametric MRI in cerebral ischemia: Preliminary Results
Authors	Vishwa S. Parekh, Jeremy R. Jacobs, Michael A. Jacobs
Abstract	The evaluation and treatment of acute cerebral ischemia requires a technique that can determine the total area of tissue at risk for infarction using diagnostic magnetic resonance imaging (MRI) sequences. Typical MRI data sets consist of T1- and T2-weighted imaging (T1WI, T2WI) along with advanced MRI parameters of diffusion-weighted imaging (DWI) and perfusion-weighted imaging (PWI) methods. Each of these parameters has distinct radiological-pathological meaning. For example, DWI interrogates the movement of water in the tissue and PWI gives an estimate of the blood flow; both are critical measures during the evolution of stroke. In order to integrate these data and give an estimate of the tissue at risk or damaged, we have developed advanced machine learning methods based on unsupervised non-linear dimensionality reduction (NLDR) techniques. NLDR methods are a class of algorithms that use mathematically defined manifolds for statistical sampling of multidimensional classes to generate a discrimination rule of guaranteed statistical accuracy. They can generate a two- or three-dimensional map, which represents the prominent structures of the data and provides an embedded image of meaningful low-dimensional structures hidden in their high-dimensional observations. In this manuscript, we develop NLDR methods on high-dimensional MRI data sets of preclinical animals and clinical patients with stroke. On analyzing the performance of these methods, we observed that there was a high degree of similarity between multiparametric embedded images from NLDR methods and the ADC map and perfusion map. It was also observed that the embedded scattergram of abnormal (infarcted or at risk) tissue can be visualized and provides a mechanism for automatic methods to delineate potential stroke volumes and early tissue at risk.
Tasks	Dimensionality Reduction
Published	2016-06-13
URL	http://arxiv.org/abs/1606.03788v1
PDF	http://arxiv.org/pdf/1606.03788v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-non-linear-dimensionality
Repo
Framework

Visual Relationship Detection with Language Priors


Title	Visual Relationship Detection with Language Priors
Authors	Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
Abstract	Visual relationships capture a wide variety of interactions between pairs of objects in images (e.g. “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible relationships is extremely large and it is difficult to obtain sufficient training examples for all possible relationships. Because of this limitation, previous work on visual relationship detection has concentrated on predicting only a handful of relationships. Though most relationships are infrequent, their objects (e.g. “man” and “bicycle”) and predicates (e.g. “riding” and “pushing”) independently occur more frequently. We propose a model that uses this insight to train visual models for objects and predicates individually and later combines them together to predict multiple relationships per image. We improve on prior work by leveraging language priors from semantic word embeddings to finetune the likelihood of a predicted relationship. Our model can scale to predict thousands of types of relationships from a few examples. Additionally, we localize the objects in the predicted relationships as bounding boxes in the image. We further demonstrate that understanding relationships can improve content based image retrieval.
Tasks	Content-Based Image Retrieval, Image Retrieval, Word Embeddings
Published	2016-07-31
URL	http://arxiv.org/abs/1608.00187v1
PDF	http://arxiv.org/pdf/1608.00187v1.pdf
PWC	https://paperswithcode.com/paper/visual-relationship-detection-with-language
Repo
Framework

Optimizing Top Precision Performance Measure of Content-Based Image Retrieval by Learning Similarity Function


Title	Optimizing Top Precision Performance Measure of Content-Based Image Retrieval by Learning Similarity Function
Authors	Ru-Ze Liang, Lihui Shi, Haoxiang Wang, Jiandong Meng, Jim Jing-Yan Wang, Qingquan Sun, Yi Gu
Abstract	In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, up to now, there is no existing similarity learning method proposed to optimize the top precision measure. To fill this gap, in this paper, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem with an objective function as the combination of the losses of the relevant images ranked behind the top-ranked irrelevant image, and the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. The experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when the top precision is used as the performance measure.
Tasks	Content-Based Image Retrieval, Image Retrieval
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06620v5
PDF	http://arxiv.org/pdf/1604.06620v5.pdf
PWC	https://paperswithcode.com/paper/optimizing-top-precision-performance-measure
Repo
Framework
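
The abstract above builds its loss from the relevant images ranked behind the top-ranked irrelevant image. A minimal sketch of the corresponding measure is given here, assuming top precision is reported as the fraction of relevant images scored strictly above the best-scored irrelevant one. That normalization, the function name, and the example scores are assumptions, not taken from the paper.

```python
def top_precision(scores, relevant):
    """Fraction of relevant items scored strictly above the best irrelevant item."""
    relevant_scores = [s for s, r in zip(scores, relevant) if r]
    irrelevant_scores = [s for s, r in zip(scores, relevant) if not r]
    if not relevant_scores:
        return 0.0  # nothing relevant to rank
    if not irrelevant_scores:
        return 1.0  # no irrelevant item can outrank a relevant one
    best_irrelevant = max(irrelevant_scores)
    ahead = sum(1 for s in relevant_scores if s > best_irrelevant)
    return ahead / len(relevant_scores)


# Hypothetical similarity scores of five database images for one query.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
relevant = [True, True, False, True, False]
value = top_precision(scores, relevant)  # two of three relevant images lead
```

Only the relevant image scored 0.4 falls behind the best irrelevant image (0.7), so it is the only one that would contribute to a loss of the kind the abstract describes.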

Radon Features and Barcodes for Medical Image Retrieval via SVM


Title	Radon Features and Barcodes for Medical Image Retrieval via SVM
Authors	Shujin Zhu, H. R. Tizhoosh
Abstract	For more than two decades, research has been performed on content-based image retrieval (CBIR). By combining Radon projections and the support vector machines (SVM), a content-based medical image retrieval method is presented in this work. The proposed approach employs the normalized Radon projections with corresponding image category labels to build an SVM classifier, and the Radon barcode database which encodes every image in a binary format is also generated simultaneously to tag all images. To retrieve similar images when a query image is given, Radon projections and the barcode of the query image are generated. Subsequently, the k-nearest neighbor search method is applied to find the images with minimum Hamming distance of the Radon barcode within the same class predicted by the trained SVM classifier that uses Radon features. The performance of the proposed method is validated by using the IRMA 2009 dataset with 14,410 x-ray images in 57 categories. The results demonstrate that our method has the capacity to retrieve similar responses for the correctly identified query image and even for those mistakenly classified by SVM. The approach further is very fast and has low memory requirement.
Tasks	Content-Based Image Retrieval, Image Retrieval, Medical Image Retrieval
Published	2016-04-16
URL	http://arxiv.org/abs/1604.04675v1
PDF	http://arxiv.org/pdf/1604.04675v1.pdf
PWC	https://paperswithcode.com/paper/radon-features-and-barcodes-for-medical-image
Repo
Framework
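
The abstract above describes binary barcodes derived from Radon projections and a nearest-neighbor search by Hamming distance. The sketch below is a heavily simplified illustration: it uses only the two axis-aligned projections (row sums and column sums) and binarizes each projection at its median. Both choices are assumptions made for brevity, the SVM class-prediction step is omitted, and the tiny images are made up.

```python
from statistics import median


def barcode(image):
    """Binarize the row-sum and column-sum projections of a 2-D image."""
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    bits = []
    for projection in (row_sums, col_sums):
        threshold = median(projection)  # assumed thresholding rule
        bits.extend(1 if value > threshold else 0 for value in projection)
    return bits


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 3x3 images standing in for x-rays.
image_a = [[0, 0, 9], [0, 9, 9], [9, 9, 9]]
image_b = [[9, 9, 9], [0, 9, 9], [0, 0, 9]]
code_a = barcode(image_a)
code_b = barcode(image_b)
distance = hamming(code_a, code_b)
```

A fuller version would add projections at more angles, and would restrict the search to images in the class predicted for the query, as the abstract describes.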

Scalable Image Retrieval by Sparse Product Quantization


Title	Scalable Image Retrieval by Sparse Product Quantization
Authors	Qingqun Ning, Jianke Zhu, Zhiyuan Zhong, Steven C. H. Hoi, Chun Chen
Abstract	Fast Approximate Nearest Neighbor (ANN) search for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is Product Quantization, which attempts to index high-dimensional image features by decomposing the feature space into a Cartesian product of low-dimensional subspaces and quantizing each of them separately. Despite the promising results reported, their quantization approach follows the typical hard assignment of traditional quantization methods, which may result in large quantization errors and thus inferior search performance. Unlike the existing approaches, in this paper, we propose a novel approach called Sparse Product Quantization (SPQ) to encode the high-dimensional feature vectors into a sparse representation. We optimize the sparse representations of the feature vectors by minimizing their quantization errors, making the resulting representation essentially close to the original data in practice. Experiments show that the proposed SPQ technique is not only able to compress data, but is also an effective encoding technique. We obtain state-of-the-art results for ANN search on four public image datasets, and the promising results of content-based image retrieval further validate the efficacy of our proposed method.
Tasks	Content-Based Image Retrieval, Image Retrieval, Quantization
Published	2016-03-15
URL	http://arxiv.org/abs/1603.04614v1
PDF	http://arxiv.org/pdf/1603.04614v1.pdf
PWC	https://paperswithcode.com/paper/scalable-image-retrieval-by-sparse-product
Repo
Framework
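
The abstract above contrasts SPQ with ordinary Product Quantization, which splits a vector into subvectors and hard-assigns each one to its nearest codeword. A minimal sketch of that baseline follows, assuming fixed, already-trained codebooks. The codebooks, the vector, and the function names are hypothetical, and the sparse encoding proposed in the paper is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Hard-assign each subvector to the index of its nearest codeword."""
    sub_dim = len(vector) // len(codebooks)
    code = []
    for m, codebook in enumerate(codebooks):
        sub = vector[m * sub_dim:(m + 1) * sub_dim]
        nearest = min(
            range(len(codebook)),
            key=lambda j: squared_distance(sub, codebook[j]),
        )
        code.append(nearest)
    return code


def pq_decode(code, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    approx = []
    for index, codebook in zip(code, codebooks):
        approx.extend(codebook[index])
    return approx


# Two hypothetical subspaces, each with two 2-D codewords.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vector = [0.9, 0.8, 0.1, 0.7]
code = pq_encode(vector, codebooks)
approx = pq_decode(code, codebooks)
error = squared_distance(vector, approx)  # quantization error of the hard assignment
```

The `error` value is the quantization error that the abstract says a sparse combination of codewords aims to reduce.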

Visual descriptors for content-based retrieval of remote sensing images


Title	Visual descriptors for content-based retrieval of remote sensing images
Authors	Paolo Napoletano
Abstract	In this paper we present an extensive evaluation of visual descriptors for the content-based retrieval of remote sensing (RS) images. The evaluation includes global hand-crafted, local hand-crafted, and Convolutional Neural Network (CNN) features coupled with four different Content-Based Image Retrieval schemes. We conducted all the experiments on two publicly available datasets: the 21-class UC Merced Land Use/Land Cover (LandUse) dataset and the 19-class High-resolution Satellite Scene dataset (SceneSat). The content of RS images might be quite heterogeneous, ranging from images containing fine-grained textures, to coarse-grained ones, or to images containing objects. It is therefore not obvious, in this domain, which descriptor should be employed to describe images having such variability. Results demonstrate that CNN-based features perform better than both global and local hand-crafted features, whichever retrieval scheme is adopted. Features extracted from SatResNet-50, a residual CNN suitably fine-tuned on the RS domain, show much better performance than a residual CNN pre-trained on multimedia scene and object images. Features extracted from NetVLAD, a CNN that considers both CNN and local features, work better than other CNN solutions on those images that contain fine-grained textures and objects.
Tasks	Content-Based Image Retrieval, Image Retrieval
Published	2016-02-02
URL	http://arxiv.org/abs/1602.00970v5
PDF	http://arxiv.org/pdf/1602.00970v5.pdf
PWC	https://paperswithcode.com/paper/visual-descriptors-for-content-based
Repo
Framework

Programs as Black-Box Explanations


Title	Programs as Black-Box Explanations
Authors	Sameer Singh, Marco Tulio Ribeiro, Carlos Guestrin
Abstract	Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility. However, it is not clear what kind of explanations, such as linear models, decision trees, and rule lists, are the appropriate family to consider, and different tasks and models may benefit from different kinds of explanations. Instead of picking a single family of representations, in this work we propose to use “programs” as model-agnostic explanations. We show that small programs can be expressive yet intuitive as explanations, and generalize over a number of existing interpretable families. We propose a prototype program induction method based on simulated annealing that approximates the local behavior of black-box classifiers around a specific prediction using random perturbations. Finally, we present preliminary application on small datasets and show that the generated explanations are intuitive and accurate for a number of classifiers.
Tasks
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07579v1
PDF	http://arxiv.org/pdf/1611.07579v1.pdf
PWC	https://paperswithcode.com/paper/programs-as-black-box-explanations
Repo
Framework

Clustering from Sparse Pairwise Measurements


Title	Clustering from Sparse Pairwise Measurements
Authors	Alaa Saade, Marc Lelarge, Florent Krzakala, Lenka Zdeborová
Abstract	We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items. We introduce three closely related algorithms for this task: a belief propagation algorithm approximating the Bayes optimal solution, and two spectral algorithms based on the non-backtracking and Bethe Hessian operators. For the case of two symmetric clusters, we conjecture that these algorithms are asymptotically optimal in that they detect the clusters as soon as it is information theoretically possible to do so. We substantiate this claim for one of the spectral approaches we introduce.
Tasks
Published	2016-01-25
URL	http://arxiv.org/abs/1601.06683v2
PDF	http://arxiv.org/pdf/1601.06683v2.pdf
PWC	https://paperswithcode.com/paper/clustering-from-sparse-pairwise-measurements
Repo
Framework

Micro-Data Learning: The Other End of the Spectrum


Title	Micro-Data Learning: The Other End of the Spectrum
Authors	Jean-Baptiste Mouret
Abstract	Many fields are now snowed under with an avalanche of data, which raises considerable challenges for computer scientists. Meanwhile, robotics (among other fields) can often only use a few dozen data points because acquiring them involves a process that is expensive or time-consuming. How can an algorithm learn with only a few data points?
Tasks
Published	2016-10-04
URL	http://arxiv.org/abs/1610.00946v1
PDF	http://arxiv.org/pdf/1610.00946v1.pdf
PWC	https://paperswithcode.com/paper/micro-data-learning-the-other-end-of-the
Repo
Framework

Kernel Ridge Regression via Partitioning


Title	Kernel Ridge Regression via Partitioning
Authors	Rashish Tandon, Si Si, Pradeep Ravikumar, Inderjit Dhillon
Abstract	In this paper, we investigate a divide and conquer approach to Kernel Ridge Regression (KRR). Given n samples, the division step involves separating the points based on some underlying disjoint partition of the input space (possibly via clustering), and then computing a KRR estimate for each partition. The conquering step is simple: for each partition, we only consider its own local estimate for prediction. We establish conditions under which we can give generalization bounds for this estimator, as well as achieve optimal minimax rates. We also show that the approximation error component of the generalization error is smaller than when a single KRR estimate is fit on the data, thus providing both statistical and computational advantages over a single KRR estimate over the entire data (or an averaging over random partitions as in other recent work, [30]). Lastly, we provide experimental validation for our proposed estimator and our assumptions.
Tasks
Published	2016-08-05
URL	http://arxiv.org/abs/1608.01976v1
PDF	http://arxiv.org/pdf/1608.01976v1.pdf
PWC	https://paperswithcode.com/paper/kernel-ridge-regression-via-partitioning
Repo
Framework
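
The divide and conquer steps in the abstract above can be sketched as follows: group the samples by a disjoint partition of the input space, fit one KRR estimate per group, and predict with the local estimate only. This is a minimal sketch under stated assumptions: 1-D inputs, an RBF kernel, a regularizer `lam` added directly to the kernel matrix diagonal, and a caller-supplied `assign` function defining the partition. None of these choices come from the paper.

```python
import math


def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel on scalars; the kernel choice is an assumption."""
    return math.exp(-gamma * (x - z) ** 2)


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_krr(xs, ys, lam=1e-3):
    """Fit KRR on one partition: solve (K + lam * I) alpha = y."""
    gram = [
        [rbf(a, b) + (lam if i == j else 0.0) for j, b in enumerate(xs)]
        for i, a in enumerate(xs)
    ]
    alpha = solve(gram, ys)
    return lambda x: sum(a * rbf(x, xi) for a, xi in zip(alpha, xs))


def fit_partitioned(xs, ys, assign, lam=1e-3):
    """Division step: one KRR estimate per cell of the partition given by `assign`."""
    groups = {}
    for x, y in zip(xs, ys):
        cell_xs, cell_ys = groups.setdefault(assign(x), ([], []))
        cell_xs.append(x)
        cell_ys.append(y)
    models = {key: fit_krr(gx, gy, lam) for key, (gx, gy) in groups.items()}
    # Conquering step: a query is answered by its own cell's estimate only.
    return lambda x: models[assign(x)](x)


# Hypothetical data: a step function, partitioned at zero.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
predict = fit_partitioned(xs, ys, assign=lambda x: x < 0)
```

Each local fit solves a system whose size is the number of points in that cell rather than n, which is where the computational advantage mentioned in the abstract comes from.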

Visual-Inertial-Semantic Scene Representation for 3-D Object Detection


Title	Visual-Inertial-Semantic Scene Representation for 3-D Object Detection
Authors	Jingming Dong, Xiaohan Fei, Stefano Soatto
Abstract	We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones. Inertials afford the ability to impose class-specific scale priors for objects, and provide a global orientation reference. A minimal sufficient representation, the posterior of semantic (identity) and syntactic (pose) attributes of objects in space, can be decomposed into a geometric term, which can be maintained by a localization-and-mapping filter, and a likelihood function, which can be approximated by a discriminatively-trained convolutional neural network. The resulting system can process the video stream causally in real time, and provides a representation of objects in the scene that is persistent: Confidence in the presence of objects grows with evidence, and objects previously seen are kept in memory even when temporarily occluded, with their return into view automatically predicted to prime re-detection.
Tasks	Object Detection
Published	2016-06-13
URL	http://arxiv.org/abs/1606.03968v2
PDF	http://arxiv.org/pdf/1606.03968v2.pdf
PWC	https://paperswithcode.com/paper/visual-inertial-semantic-scene-representation
Repo
Framework

Exploiting Semantic Information and Deep Matching for Optical Flow


Title	Exploiting Semantic Information and Deep Matching for Optical Flow
Authors	Min Bai, Wenjie Luo, Kaustav Kundu, Raquel Urtasun
Abstract	We tackle the problem of estimating optical flow from a monocular camera in the context of autonomous driving. We build on the observation that the scene is typically composed of a static background, as well as a relatively small number of traffic participants which move rigidly in 3D. We propose to estimate the traffic participants using instance-level segmentation. For each traffic participant, we use the epipolar constraints that govern each independent motion for faster and more accurate estimation. Our second contribution is a new convolutional net that learns to perform flow matching, and is able to estimate the uncertainty of its matches. This is a core element of our flow estimation pipeline. We demonstrate the effectiveness of our approach in the challenging KITTI 2015 flow benchmark, and show that our approach outperforms published approaches by a large margin.
Tasks	Autonomous Driving, Optical Flow Estimation
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01827v2
PDF	http://arxiv.org/pdf/1604.01827v2.pdf
PWC	https://paperswithcode.com/paper/exploiting-semantic-information-and-deep
Repo
Framework

Exploring the Neural Algorithm of Artistic Style


Title	Exploring the Neural Algorithm of Artistic Style
Authors	Yaroslav Nikulin, Roman Novak
Abstract	We explore the method of style transfer presented in the article “A Neural Algorithm of Artistic Style” by Leon A. Gatys, Alexander S. Ecker and Matthias Bethge (arXiv:1508.06576). We first demonstrate the power of the suggested style space on a few examples. We then vary different hyper-parameters and program properties that were not discussed in the original paper, among which are the recognition network used, starting point of the gradient descent and different ways to partition style and content layers. We also give a brief comparison of some of the existing algorithm implementations and deep learning frameworks used. To study the style space further we attempt to generate synthetic images by maximizing a single entry in one of the Gram matrices $\mathcal{G}_l$ and some interesting results are observed. Next, we try to mimic the sparsity and intensity distribution of Gram matrices obtained from a real painting and generate more complex textures. Finally, we propose two new style representations built on top of network’s features and discuss how one could be used to achieve local and potentially content-aware style transfer.
Tasks	Style Transfer
Published	2016-02-23
URL	http://arxiv.org/abs/1602.07188v2
PDF	http://arxiv.org/pdf/1602.07188v2.pdf
PWC	https://paperswithcode.com/paper/exploring-the-neural-algorithm-of-artistic
Repo
Framework
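
The abstract above refers to the Gram matrices $\mathcal{G}_l$ that define the style representation. As a small illustration, a Gram matrix of one layer can be computed from its feature maps as the inner products between channels. The sketch assumes each channel has been flattened to a list of spatial responses and applies no normalization. The feature values are made up.

```python
def gram_matrix(features):
    """Inner products between channels; one row of `features` per channel."""
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]


# Hypothetical layer with 2 channels and 3 spatial positions.
features = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
]
G = gram_matrix(features)
```

Entry `G[i][j]` measures how strongly channels `i` and `j` respond together across the image, regardless of where in the image that happens.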

Training of spiking neural networks based on information theoretic costs


Title	Training of spiking neural networks based on information theoretic costs
Authors	Oleg Y. Sinyavskiy
Abstract	A spiking neural network is a type of artificial neural network in which neurons communicate with each other through spikes. Spikes are identical Boolean events characterized by the time of their arrival. A spiking neuron has internal dynamics and responds to the history of inputs as opposed to the current inputs only. Because of such properties, a spiking neural network has rich intrinsic capabilities to process spatiotemporal data. However, because the spikes are discontinuous ‘yes or no’ events, it is not trivial to apply traditional training procedures such as gradient descent to the spiking neurons. In this thesis we propose to use stochastic spiking neuron models in which the probability of a spiking output is a continuous function of parameters. We formulate several learning tasks as minimization of certain information-theoretic cost functions that use spiking output probability distributions. We develop a generalized description of the stochastic spiking neuron and a new spiking neuron model that allows flexible processing of rich spatiotemporal data. We formulate and derive learning rules for the following tasks: (1) a supervised learning task of detecting a spatiotemporal pattern as a minimization of the negative log-likelihood (the surprisal) of the neuron’s output; (2) an unsupervised learning task of increasing the stability of the neuron’s output as a minimization of the entropy; (3) a reinforcement learning task of controlling an agent as a modulated optimization of filtered surprisal of the neuron’s output. We test the derived learning rules in several experiments such as spatiotemporal pattern detection, spatiotemporal data storing and recall with autoassociative memory, combination of supervised and unsupervised learning to speed up the learning process, and adaptive control of simple virtual agents in changing environments.
Tasks
Published	2016-02-15
URL	http://arxiv.org/abs/1602.04742v1
PDF	http://arxiv.org/pdf/1602.04742v1.pdf
PWC	https://paperswithcode.com/paper/training-of-spiking-neural-networks-based-on
Repo
Framework
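
The supervised task in the abstract above minimizes the negative log-likelihood (surprisal) of a stochastic neuron's output. A minimal sketch of that cost follows, assuming a memoryless neuron whose spike probability at each time step is a sigmoid of a weighted input sum. The sigmoid, the absence of internal dynamics, and all names and values are simplifying assumptions, not the thesis's model.

```python
import math


def spike_probability(weights, inputs):
    """Probability of a spike as a continuous function of the weights."""
    drive = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-drive))


def surprisal(weights, input_sequence, target_spikes):
    """Negative log-likelihood of a target spike train under the neuron model."""
    total = 0.0
    for inputs, spike in zip(input_sequence, target_spikes):
        p = spike_probability(weights, inputs)
        total -= math.log(p if spike else 1.0 - p)
    return total


# Hypothetical two-step input pattern with a spike wanted only at the first step.
pattern = [[1.0, 0.0], [0.0, 1.0]]
targets = [1, 0]
untrained = surprisal([0.0, 0.0], pattern, targets)
tuned = surprisal([2.0, -2.0], pattern, targets)
```

Because the cost is a smooth function of the weights, it can be lowered by gradient-based updates even though each individual spike is a discrete event.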