October 17, 2019

3104 words 15 mins read

Paper Group ANR 748

Perfusion parameter estimation using neural networks and data augmentation. Detail Preserving Depth Estimation from a Single Image Using Attention Guided Networks. Consistent Position Bias Estimation without Online Interventions for Learning-to-Rank. Visual Localization of Key Positions for Visually Impaired People. Learning Object Localization and …

Perfusion parameter estimation using neural networks and data augmentation

Title Perfusion parameter estimation using neural networks and data augmentation
Authors David Robben, Paul Suetens
Abstract Perfusion imaging plays a crucial role in acute stroke diagnosis and treatment decision making. Current perfusion analysis relies on deconvolution of the measured signals, an operation that is mathematically ill-conditioned and requires strong regularization. We propose a neural network and a data augmentation approach to predict perfusion parameters directly from the native measurements. A comparison on simulated CT Perfusion data shows that the neural network provides better estimates of both CBF and Tmax than a state-of-the-art deconvolution method, and does so over a wide range of noise levels. The proposed data augmentation makes it possible to achieve these results with fewer than 100 datasets.
Tasks Data Augmentation, Decision Making
Published 2018-10-11
URL http://arxiv.org/abs/1810.04898v1
PDF http://arxiv.org/pdf/1810.04898v1.pdf
PWC https://paperswithcode.com/paper/perfusion-parameter-estimation-using-neural
Repo
Framework
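
The mapping in this paper goes straight from native time-attenuation curves to CBF and Tmax. Below is a minimal sketch of that idea, assuming a fully connected network, 40 time points per curve, and a concatenated arterial input function (AIF); the layer sizes and the `augment` helper are illustrative, not the authors' architecture.

```python
import torch
import torch.nn as nn

class PerfusionNet(nn.Module):
    """Maps a voxel's native time-attenuation curve (plus the arterial
    input function) directly to perfusion parameters, skipping the
    ill-conditioned deconvolution step."""
    def __init__(self, n_timepoints: int = 40):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_timepoints, 256),  # voxel curve + AIF, concatenated
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, 2),  # outputs: CBF and Tmax
        )

    def forward(self, voxel_curve, aif):
        return self.net(torch.cat([voxel_curve, aif], dim=-1))

# Data augmentation in the spirit of the abstract: perturb training curves
# with noise so a small number of datasets covers a wide range of noise levels.
def augment(curve, noise_std):
    return curve + noise_std * torch.randn_like(curve)
```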

Detail Preserving Depth Estimation from a Single Image Using Attention Guided Networks

Title Detail Preserving Depth Estimation from a Single Image Using Attention Guided Networks
Authors Zhixiang Hao, Yu Li, Shaodi You, Feng Lu
Abstract Convolutional neural networks have demonstrated superior performance on single-image depth estimation in recent years. These works usually use stacked spatial pooling or strided convolutions to obtain high-level information, which are common practices in classification tasks. However, depth estimation is a dense prediction problem, and low-resolution feature maps usually generate blurred depth maps, which is undesirable in applications. In order to produce high-quality depth maps that are both clean and accurate, we propose a network consisting of a Dense Feature Extractor (DFE) and a Depth Map Generator (DMG). The DFE combines ResNet and dilated convolutions; it extracts multi-scale information from the input image while keeping the feature maps dense. In the DMG, we use an attention mechanism to fuse the multi-scale features produced by the DFE. Our network is trained end-to-end and does not need any post-processing, so it runs fast and can predict depth maps at about 15 fps. Experimental results show that our method is competitive with the state of the art in quantitative evaluation while preserving better structural details of the scene depth.
Tasks Depth Estimation
Published 2018-09-03
URL http://arxiv.org/abs/1809.00646v1
PDF http://arxiv.org/pdf/1809.00646v1.pdf
PWC https://paperswithcode.com/paper/detail-preserving-depth-estimation-from-a
Repo
Framework
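
Two building blocks from the abstract lend themselves to a short sketch: a dilated residual block that keeps feature maps dense (the DFE side) and an attention-weighted fusion of two feature maps (the DMG side). The module names and channel counts below are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Dilated 3x3 convolution: enlarges the receptive field while keeping
    the feature map dense (no pooling or striding), as in the DFE."""
    def __init__(self, channels: int, dilation: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3,
                              padding=dilation, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x) + x)  # residual connection

class AttentionFusion(nn.Module):
    """Fuses two feature maps of the same resolution with a learned
    per-pixel attention mask, in the spirit of the DMG."""
    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, fine, coarse):
        a = self.attn(torch.cat([fine, coarse], dim=1))
        return a * fine + (1 - a) * coarse  # attention-weighted blend
```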

Consistent Position Bias Estimation without Online Interventions for Learning-to-Rank

Title Consistent Position Bias Estimation without Online Interventions for Learning-to-Rank
Authors Aman Agarwal, Ivan Zaitsev, Thorsten Joachims
Abstract Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal with uninformative signals due to position in the ranking, saliency, and other presentation factors. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias if observation propensities are known, it remains to be shown how to accurately estimate these propensities. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. We merely require implicit feedback data from multiple different ranking functions. Furthermore, we argue that our estimation technique applies to an extended class of Contextual Position-Based Propensity Models, where propensities depend not only on position but also on observable features of the query and document. Initial simulation studies confirm that the approach is scalable, accurate, and robust.
Tasks Learning-To-Rank
Published 2018-06-09
URL http://arxiv.org/abs/1806.03555v1
PDF http://arxiv.org/pdf/1806.03555v1.pdf
PWC https://paperswithcode.com/paper/consistent-position-bias-estimation-without
Repo
Framework
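
The abstract's key observation is that implicit feedback from multiple rankers shows the same (query, document) pair at different positions, which reveals propensity ratios under a position-based model. The estimator below is a simplified least-squares illustration of that idea, not the paper's actual estimator; the log format and zero-based position indexing are assumed.

```python
import numpy as np
from collections import defaultdict

def estimate_propensities(logs, n_positions):
    """Illustrative estimator: each log entry is
    (query, doc, ranker, position, clicked). Under a position-based model,
    P(click) = p[position] * relevance(query, doc), so the same (query, doc)
    shown at different positions (by different rankers) reveals propensity
    ratios. We solve for log-propensities by least squares on log click rates.
    """
    counts = defaultdict(lambda: [0, 0])  # (q, d, pos) -> [clicks, impressions]
    for q, d, _, pos, clicked in logs:
        counts[(q, d, pos)][0] += clicked
        counts[(q, d, pos)][1] += 1

    rows, rhs = [], []
    cells = list(counts.items())
    for i, ((q, d, k), (ck, nk)) in enumerate(cells):
        for (q2, d2, k2), (ck2, nk2) in cells[i + 1:]:
            if (q, d) == (q2, d2) and k != k2 and ck > 0 and ck2 > 0:
                # log p[k] - log p[k2] = log(ctr at k) - log(ctr at k2)
                row = np.zeros(n_positions)
                row[k], row[k2] = 1.0, -1.0
                rows.append(row)
                rhs.append(np.log(ck / nk) - np.log(ck2 / nk2))

    # anchor p[0] = 1 to remove the scale ambiguity, then solve
    A = np.vstack([np.array(rows), np.eye(1, n_positions)])
    b = np.append(np.array(rhs), 0.0)
    log_p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.exp(log_p)
```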

Visual Localization of Key Positions for Visually Impaired People

Title Visual Localization of Key Positions for Visually Impaired People
Authors Ruiqi Cheng, Kaiwei Wang, Longqing Lin, Kailun Yang
Abstract On off-the-shelf navigational assistance devices, localization precision is limited by the signal error of the global navigation satellite system (GNSS). When traveling outdoors, inaccurate localization confuses visually impaired people, especially at key positions such as gates, bus stations, or intersections. Visual localization is a feasible approach to improving the positioning precision of assistive devices. Using multiple image descriptors, the paper proposes a robust and efficient visual localization algorithm that takes advantage of prior GNSS signals and multi-modal images to achieve accurate localization of key positions. In the experiments, we implement the approach on a wearable system and test the performance of visual localization in practical scenarios.
Tasks Visual Localization
Published 2018-10-09
URL http://arxiv.org/abs/1810.03790v1
PDF http://arxiv.org/pdf/1810.03790v1.pdf
PWC https://paperswithcode.com/paper/visual-localization-of-key-positions-for
Repo
Framework
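
A hedged sketch of the two-stage idea in the abstract: a rough GNSS prior first prunes the keyframe database, then image descriptors pick the best match among the survivors. The dictionary keys, the 50 m radius, and cosine similarity are all illustrative assumptions.

```python
import numpy as np

def localize(query_desc, query_gnss, keyframes, gnss_radius_m=50.0):
    """Illustrative pipeline: restrict the database to keyframes near the
    GNSS prior, then match global image descriptors. `keyframes` is a list
    of dicts with 'desc' (np.ndarray), 'gnss' (local metric x, y) and 'pose'.
    """
    candidates = [kf for kf in keyframes
                  if np.linalg.norm(kf["gnss"] - query_gnss) < gnss_radius_m]
    if not candidates:
        return None  # fall back to the raw GNSS fix
    # cosine similarity between global image descriptors
    sims = [query_desc @ kf["desc"] /
            (np.linalg.norm(query_desc) * np.linalg.norm(kf["desc"]))
            for kf in candidates]
    return candidates[int(np.argmax(sims))]["pose"]
```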

Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images

Title Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images
Authors Jean-Philippe Mercier, Chaitanya Mitash, Philippe Giguère, Abdeslam Boularias
Abstract This work proposes a process for efficiently training a point-wise object detector that can localize objects and compute their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved with a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating their locations. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training. The detector resulting from this training process can localize objects using its per-object activation maps; in this work, we use the activation maps to guide the search for the 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluate our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
Tasks 6D Pose Estimation, 6D Pose Estimation using RGB, Domain Adaptation, Object Localization, Pose Estimation, Robotic Grasping
Published 2018-06-18
URL http://arxiv.org/abs/1806.06888v2
PDF http://arxiv.org/pdf/1806.06888v2.pdf
PWC https://paperswithcode.com/paper/learning-object-localization-and-6d-pose
Repo
Framework
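
The weak annotation described above (a list of objects per image, no boxes) can be turned into a training signal by spatially max-pooling the detector's per-object activation maps into image-level scores. The loss below is a minimal sketch of that reduction; the tensor shapes are assumed.

```python
import torch
import torch.nn as nn

def weak_label_loss(activation_maps, present):
    """Weakly supervised sketch: `activation_maps` is the detector's
    per-object heatmap (B, n_objects, H, W) and `present` a binary
    image-level label (B, n_objects) listing which objects appear.
    Max-pooling over space reduces localization to a multi-label problem,
    so the annotator only needs to list the objects in each image."""
    image_scores = activation_maps.flatten(2).max(dim=2).values  # (B, n_objects)
    return nn.functional.binary_cross_entropy_with_logits(image_scores,
                                                          present.float())
```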

Learning to rank for censored survival data

Title Learning to rank for censored survival data
Authors Margaux Luck, Tristan Sylvain, Joseph Paul Cohen, Heloise Cardinal, Andrea Lodi, Yoshua Bengio
Abstract Survival analysis is a type of semi-supervised ranking task where the target output (the survival time) is often right-censored. Utilizing this information is a challenge because it is not obvious how to correctly incorporate censored examples into a model. We study how three categories of loss functions can take advantage of this information: partial likelihood methods, rank methods, and our classification method, which is based on a Wasserstein metric (WM) and uses the non-parametric Kaplan-Meier estimate of the probability density to impute the labels of censored examples. The proposed method yields a model that predicts the probability distribution of an event. If a clinician had access to the detailed probability of an event over time, this would help in treatment planning, for example in determining whether the risk of kidney graft rejection is constant or peaks after some time. We also demonstrate that this approach directly optimizes the expected C-index, the most common evaluation metric for ranking survival models.
Tasks Learning-To-Rank, Survival Analysis
Published 2018-06-06
URL http://arxiv.org/abs/1806.01984v2
PDF http://arxiv.org/pdf/1806.01984v2.pdf
PWC https://paperswithcode.com/paper/learning-to-rank-for-censored-survival-data
Repo
Framework
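
Two ingredients of the proposed loss are easy to sketch: the Kaplan-Meier estimate used to impute labels for censored examples, and a 1D Wasserstein distance between discrete distributions over time bins. Both functions below are simplified illustrations; the time grid is assumed to contain the observed event times.

```python
import numpy as np

def kaplan_meier(times, events, grid):
    """Kaplan-Meier survival estimate S(t) on a fixed time grid.
    `events` is 1 for an observed event, 0 for right-censoring; `grid`
    should include the observed event times."""
    s, surv = 1.0, []
    for t in grid:
        at_risk = np.sum(times >= t)
        d = np.sum((times == t) & (events == 1))
        if at_risk > 0:
            s *= 1.0 - d / at_risk
        surv.append(s)
    return np.array(surv)

def wasserstein_1d(pred_pmf, target_pmf, bin_width=1.0):
    """W1 between two discrete distributions on the same grid: the L1
    distance between their CDFs, scaled by the bin width."""
    return bin_width * np.abs(np.cumsum(pred_pmf) - np.cumsum(target_pmf)).sum()
```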

More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch

Title More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch
Authors Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward H. Adelson, Sergey Levine
Abstract For humans, the process of grasping an object relies heavily on rich tactile feedback. Most recent robotic grasping work, however, has been based only on visual input, and thus cannot easily benefit from feedback after initiating contact. In this paper, we investigate how a robot can learn to use tactile information to iteratively and efficiently adjust its grasp. To this end, we propose an end-to-end action-conditional model that learns regrasping policies from raw visuo-tactile data. This model – a deep, multimodal convolutional network – predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions. Our approach requires neither calibration of the tactile sensors, nor any analytical modeling of contact forces, thus reducing the engineering effort required to obtain efficient grasping policies. We train our model with data from about 6,450 grasping trials on a two-finger gripper equipped with GelSight high-resolution tactile sensors on each finger. Across extensive experiments, our approach outperforms a variety of baselines at (i) estimating grasp adjustment outcomes, (ii) selecting efficient grasp adjustments for quick grasping, and (iii) reducing the amount of force applied at the fingers, while maintaining competitive performance. Finally, we study the choices made by our model and show that it has successfully acquired useful and interpretable grasping behaviors.
Tasks Calibration, Robotic Grasping
Published 2018-05-28
URL http://arxiv.org/abs/1805.11085v2
PDF http://arxiv.org/pdf/1805.11085v2.pdf
PWC https://paperswithcode.com/paper/more-than-a-feeling-learning-to-grasp-and
Repo
Framework
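
A minimal sketch of the action-conditional outcome model and the greedy regrasping loop from the abstract. The stub feature extractors, the 4-dimensional action, and the candidate-scoring loop are assumptions standing in for the paper's deep multimodal network.

```python
import torch
import torch.nn as nn

class GraspOutcomeModel(nn.Module):
    """Predicts P(success) for a candidate grasp adjustment from visual and
    tactile input. The two small CNN stems are placeholders."""
    def __init__(self, action_dim: int = 4):
        super().__init__()
        self.vision = nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.touch = nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                                   nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Sequential(nn.Linear(16 + 16 + action_dim, 64),
                                  nn.ReLU(), nn.Linear(64, 1))

    def forward(self, image, tactile, action):
        feats = torch.cat([self.vision(image), self.touch(tactile), action], -1)
        return torch.sigmoid(self.head(feats))

def pick_action(model, image, tactile, candidates):
    """Greedy regrasping: score sampled adjustments, execute the best one."""
    with torch.no_grad():
        scores = model(image.expand(len(candidates), -1, -1, -1),
                       tactile.expand(len(candidates), -1, -1, -1),
                       candidates)
    return candidates[scores.squeeze(-1).argmax()]
```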

Multi-bin Trainable Linear Unit for Fast Image Restoration Networks

Title Multi-bin Trainable Linear Unit for Fast Image Restoration Networks
Authors Shuhang Gu, Radu Timofte, Luc Van Gool
Abstract Tremendous advances in image restoration tasks such as denoising and super-resolution have been achieved using neural networks. Such approaches generally employ very deep architectures, a large number of parameters, large receptive fields, and high nonlinear modeling capacity. To obtain efficient and fast image restoration networks, one should reduce these requirements. In this paper we propose a novel activation function, the multi-bin trainable linear unit (MTLU), which increases nonlinear modeling capacity while allowing lighter and shallower networks. We validate the proposed fast image restoration networks for image denoising (FDnet) and super-resolution (FSRnet) on standard benchmarks. We achieve large improvements in both memory and runtime over the current state of the art for comparable or better PSNR accuracies.
Tasks Denoising, Image Denoising, Image Restoration, Super-Resolution
Published 2018-07-30
URL http://arxiv.org/abs/1807.11389v1
PDF http://arxiv.org/pdf/1807.11389v1.pdf
PWC https://paperswithcode.com/paper/multi-bin-trainable-linear-unit-for-fast
Repo
Framework
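
The MTLU itself is simple enough to sketch: the input range is split into bins, and each bin learns its own linear map a_k * x + b_k. The uniform binning over [-bound, bound] and the ReLU-like initialization below are assumptions consistent with, but not copied from, the paper.

```python
import torch
import torch.nn as nn

class MTLU(nn.Module):
    """Multi-bin Trainable Linear Unit (sketch): each bin k of the input
    range applies its own learned linear map a_k * x + b_k. Initialized so
    the unit starts out behaving like a ReLU."""
    def __init__(self, n_bins: int = 40, bound: float = 2.0):
        super().__init__()
        self.n_bins, self.bound = n_bins, bound
        self.bin_width = 2 * bound / n_bins
        centers = -bound + self.bin_width * (torch.arange(n_bins) + 0.5)
        self.a = nn.Parameter((centers > 0).float())   # slope per bin
        self.b = nn.Parameter(torch.zeros(n_bins))     # offset per bin

    def forward(self, x):
        idx = ((x + self.bound) / self.bin_width).floor().long()
        idx = idx.clamp(0, self.n_bins - 1)            # clamp out-of-range inputs
        return self.a[idx] * x + self.b[idx]
```

Because the whole nonlinearity is a lookup plus a multiply-add, it adds capacity at almost no runtime cost, which is the trade-off the abstract exploits.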

Edge-Based Recognition of Novel Objects for Robotic Grasping

Title Edge-Based Recognition of Novel Objects for Robotic Grasping
Authors Amirhossein Jabalameli, Nabil Ettehadi, Aman Behal
Abstract In this paper, we investigate the problem of grasping novel objects in unstructured environments. Addressing this problem requires consideration of object geometry, reachability, and force-closure analysis. We propose a framework for grasping unknown objects by localizing contact regions on the contours formed by a set of depth edges in a single-view 2D depth image. The contact regions are determined according to the edge geometric features obtained from analyzing the depth map. Finally, we validate the performance of the approach by applying it to scenes with both single and multiple objects, using a Baxter manipulator.
Tasks Robotic Grasping
Published 2018-02-23
URL http://arxiv.org/abs/1802.08753v1
PDF http://arxiv.org/pdf/1802.08753v1.pdf
PWC https://paperswithcode.com/paper/edge-based-recognition-of-novel-objects-for
Repo
Framework
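
A hedged sketch of the first stage described above: extracting depth edges from a single-view depth image and grouping them into contours, on which contact regions would then be localized. The Sobel-gradient threshold is an assumed stand-in for the paper's edge extraction.

```python
import cv2
import numpy as np

def depth_edge_contours(depth, grad_thresh=0.02):
    """Extract depth edges from a float depth image (meters) and group them
    into contours. `grad_thresh` is an assumed parameter: pixels whose depth
    gradient magnitude exceeds it are treated as edges."""
    gx = cv2.Sobel(depth, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(depth, cv2.CV_32F, 0, 1, ksize=3)
    edges = (np.hypot(gx, gy) > grad_thresh).astype(np.uint8)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours  # candidate curves on which to localize contact regions
```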

Spot the Difference by Object Detection

Title Spot the Difference by Object Detection
Authors Junhui Wu, Yun Ye, Yu Chen, Zhi Weng
Abstract In this paper, we propose a simple yet effective solution to a change detection task that detects the difference between two images, which we call “spot the difference”. Our approach uses CNN-based object detection, stacking two aligned images as input and treating the differences between the two images as objects to detect. An early-merging architecture is used as the backbone network. Our method is accurate, fast, and robust while using very cheap annotation. We verify the proposed method on the task of detecting changes between the digital design of a book and its photographic image. Compared to verification-based methods, our object-detection-based method outperforms them by a large margin and provides location information as well. We compress the network and achieve a 24-fold speedup while keeping the accuracy. Moreover, as we synthesize the training data for detection using weakly labeled images, our method does not need expensive bounding-box annotation.
Tasks Object Detection
Published 2018-01-03
URL http://arxiv.org/abs/1801.01051v1
PDF http://arxiv.org/pdf/1801.01051v1.pdf
PWC https://paperswithcode.com/paper/spot-the-difference-by-object-detection
Repo
Framework
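
The early-merging idea is compact: stack the two aligned images into a six-channel tensor and let a shared convolutional stem see both at once, so differences surface as detectable "objects". The sketch below shows only that stem; the detection head and the channel count are assumptions.

```python
import torch
import torch.nn as nn

class EarlyMergeStem(nn.Module):
    """Early-merging backbone stem: the two aligned images are concatenated
    along the channel axis (3 + 3 = 6 channels) before any convolution, so
    the network can compare them from the very first layer."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(6, out_channels, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        )

    def forward(self, img_a, img_b):
        return self.stem(torch.cat([img_a, img_b], dim=1))
```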

Mixed Supervised Object Detection with Robust Objectness Transfer

Title Mixed Supervised Object Detection with Robust Objectness Transfer
Authors Yan Li, Junge Zhang, Kaiqi Huang, Jianguo Zhang
Abstract In this paper, we consider the problem of leveraging existing fully labeled categories to improve the weakly supervised detection (WSD) of new object categories, which we refer to as mixed supervised detection (MSD). Different from previous MSD methods that directly transfer the pre-trained object detectors from existing categories to new categories, we propose a more reasonable and robust objectness transfer approach for MSD. In our framework, we first learn domain-invariant objectness knowledge from the existing fully labeled categories. The knowledge is modeled based on invariant features that are robust to the distribution discrepancy between the existing categories and new categories; therefore the resulting knowledge would generalize well to new categories and could assist detection models to reject distractors (e.g., object parts) in weakly labeled images of new categories. Under the guidance of learned objectness knowledge, we utilize multiple instance learning (MIL) to model the concepts of both objects and distractors and to further improve the ability of rejecting distractors in weakly labeled images. Our robust objectness transfer approach outperforms the existing MSD methods, and achieves state-of-the-art results on the challenging ILSVRC2013 detection dataset and the PASCAL VOC datasets.
Tasks Multiple Instance Learning, Object Detection
Published 2018-02-27
URL https://arxiv.org/abs/1802.09778v3
PDF https://arxiv.org/pdf/1802.09778v3.pdf
PWC https://paperswithcode.com/paper/mixed-supervised-object-detection-with-robust
Repo
Framework
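
One standard way to learn the domain-invariant features the abstract describes is adversarial training with a gradient reversal layer: the feature extractor is trained to fool a domain discriminator. The sketch below shows that layer; whether the paper uses exactly this mechanism is not stated in the abstract, so treat it as an assumed illustration.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass, negated
    (scaled) gradient in the backward pass, so features that feed a domain
    discriminator are pushed toward domain invariance."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

def grad_reverse(x, lam=1.0):
    # insert between the feature extractor and the domain discriminator
    return GradReverse.apply(x, lam)
```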

A Batched Scalable Multi-Objective Bayesian Optimization Algorithm

Title A Batched Scalable Multi-Objective Bayesian Optimization Algorithm
Authors Xi Lin, Hui-Ling Zhen, Zhenhua Li, Qingfu Zhang, Sam Kwong
Abstract Surrogate-assisted optimization is a promising approach for solving expensive multi-objective optimization problems. However, most existing surrogate-assisted multi-objective optimization algorithms have three main drawbacks: 1) they cannot scale well to problems with high-dimensional decision spaces, 2) they cannot incorporate available gradient information, and 3) they do not support batch optimization. These drawbacks prevent their use for many real-world large-scale optimization problems. This paper proposes a batched scalable multi-objective Bayesian optimization algorithm to tackle these issues. The proposed algorithm uses a Bayesian neural network as the scalable surrogate model. Powered by Monte Carlo dropout and Sobolev training, the model can be easily trained and can incorporate available gradient information. We also propose a novel batch hypervolume upper confidence bound acquisition function to support batch optimization. Experimental results on various benchmark problems and a real-world application demonstrate the efficiency of the proposed algorithm.
Tasks
Published 2018-11-04
URL http://arxiv.org/abs/1811.01323v1
PDF http://arxiv.org/pdf/1811.01323v1.pdf
PWC https://paperswithcode.com/paper/a-batched-scalable-multi-objective-bayesian
Repo
Framework
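
Two pieces of the proposed algorithm sketch cleanly: an MC-dropout surrogate whose repeated stochastic forward passes yield a predictive mean and uncertainty, and a per-objective UCB score that would feed the batch hypervolume acquisition (omitted here). Layer widths and the dropout rate are assumptions.

```python
import torch
import torch.nn as nn

class DropoutSurrogate(nn.Module):
    """Bayesian-NN surrogate sketch: dropout stays active at prediction time
    (MC dropout), so repeated stochastic forward passes approximate a
    predictive distribution over the objectives."""
    def __init__(self, in_dim, n_objectives, hidden=128, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, n_objectives),
        )

    def predict(self, x, n_samples=50):
        self.train()  # keep dropout active for MC sampling
        with torch.no_grad():
            samples = torch.stack([self.net(x) for _ in range(n_samples)])
        return samples.mean(0), samples.std(0)

def ucb(mean, std, beta=2.0):
    """Per-objective upper confidence bound; the batch hypervolume
    acquisition built on top of it is omitted in this sketch."""
    return mean + beta * std
```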

fMRI Semantic Category Decoding using Linguistic Encoding of Word Embeddings

Title fMRI Semantic Category Decoding using Linguistic Encoding of Word Embeddings
Authors Subba Reddy Oota, Naresh Manwani, Bapi Raju S
Abstract How the human brain represents conceptual knowledge has been debated in many scientific fields. Brain imaging studies have shown that spatial patterns of neural activation in the brain are correlated with thinking about different semantic categories of words (for example, tools, animals, and buildings) or with viewing related pictures. In this paper, we present a computational model that learns to predict the neural activation captured in functional magnetic resonance imaging (fMRI) data for test words. Unlike the hand-crafted-feature models used in the literature, we propose a novel approach in which decoding models are built with features extracted from popular linguistic encodings (Word2Vec, GloVe, Meta-Embeddings) in conjunction with the empirical fMRI data recorded while viewing several dozen concrete nouns. We compare these models with several others that use word features extracted from FastText, randomly generated features, and Mitchell’s 25 features [1]. The experimental results show that fMRI images predicted using Meta-Embeddings match state-of-the-art performance. Although models with features from GloVe and Word2Vec predict fMRI images on par with the state-of-the-art model, the model with Meta-Embeddings features predicts significantly better. The proposed scheme based on popular linguistic encodings offers a simple and easy approach to semantic decoding from fMRI experiments.
Tasks Word Embeddings
Published 2018-06-13
URL http://arxiv.org/abs/1806.05177v1
PDF http://arxiv.org/pdf/1806.05177v1.pdf
PWC https://paperswithcode.com/paper/fmri-semantic-category-decoding-using
Repo
Framework
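
At its core, the setup above is a regularized linear map from word-embedding features to voxel activations, trained on the concrete nouns and tested on held-out words. The ridge regression below is a minimal sketch of that setup, not the paper's exact model.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_encoder(word_vectors, fmri_images, alpha=1.0):
    """Encoding-model sketch: learn a regularized linear map from embedding
    features (n_words, embed_dim) to fMRI voxel activations
    (n_words, n_voxels); `alpha` is an assumed regularization strength."""
    model = Ridge(alpha=alpha)
    model.fit(word_vectors, fmri_images)
    return model

# Usage: predict images for held-out test words and compare to the
# empirical fMRI data, e.g. by voxel-wise correlation.
# predicted = fit_encoder(train_X, train_Y).predict(test_X)
```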

Reliable Identification of Redundant Kernels for Convolutional Neural Network Compression

Title Reliable Identification of Redundant Kernels for Convolutional Neural Network Compression
Authors Wei Wang, Liqiang Zhu
Abstract To compress deep convolutional neural networks (CNNs) with large memory footprints and long inference times, this paper proposes a novel pruning criterion based on the layer-wise Ln-norm of feature maps. Unlike existing pruning criteria, which are mainly based on the L1-norm of convolution kernels, the proposed method uses the Ln-norm of output feature maps after the non-linear activations, where n is a variable increasing from 1 at the first convolution layer to infinity at the last convolution layer. With its ability to accurately identify unimportant convolution kernels, the proposed method achieves a good balance between model size and inference accuracy. Experiments on ImageNet and a successful application in a railway surveillance system show that the proposed method outperforms existing kernel-norm-based methods and is generally applicable to any deep neural network with convolution operations.
Tasks Neural Network Compression
Published 2018-12-10
URL http://arxiv.org/abs/1812.03608v1
PDF http://arxiv.org/pdf/1812.03608v1.pdf
PWC https://paperswithcode.com/paper/reliable-identification-of-redundant-kernels
Repo
Framework
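
The criterion is straightforward to sketch: score each kernel by the Ln-norm of its activated output feature map, with n growing from 1 toward infinity across layers, then prune the lowest-scoring channels. The tensor shapes and pruning ratio below are assumptions.

```python
import torch

def kernel_importance(feature_maps, n):
    """Sketch of the pruning criterion: importance of each convolution
    kernel is the Ln-norm of its output feature map after the non-linearity.
    `feature_maps` has shape (batch, channels, H, W); n may be float('inf')."""
    flat = feature_maps.abs().flatten(2)              # (B, C, H*W)
    if n == float("inf"):
        score = flat.max(dim=2).values                # L_inf norm
    else:
        score = flat.pow(n).sum(dim=2).pow(1.0 / n)   # L_n norm
    return score.mean(dim=0)                          # average over the batch

def channels_to_prune(score, ratio=0.3):
    """Indices of the least important channels at a given pruning ratio."""
    k = int(ratio * score.numel())
    return torch.argsort(score)[:k]
```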

User Loss – A Forced-Choice-Inspired Approach to Train Neural Networks directly by User Interaction

Title User Loss – A Forced-Choice-Inspired Approach to Train Neural Networks directly by User Interaction
Authors Shahab Zarei, Bernhard Stimpel, Christopher Syben, Andreas Maier
Abstract In this paper, we investigate whether it is possible to train a neural network directly from user inputs. We consider this approach highly relevant for applications in which the point of optimality is not well defined and is user-dependent. Our application is medical image denoising, which is essential in fluoroscopy imaging. In this field every user, i.e., every physician, has different preferences, and image quality needs to be tailored to each individual. To address this important problem, we propose constructing a loss function derived from a forced-choice experiment. To make the learning problem feasible, we operate in the domain of precision learning, i.e., we base the network architecture on traditional signal processing methods in order to reduce the number of trainable parameters. The architecture used here is a Laplacian pyramid with only six trainable parameters. In the experimental results, we demonstrate that our approach can create two image experts that prefer different trade-offs between sharpness and denoising. Moreover, models trained for a specific user perform best on that user’s test data. This approach opens the way toward implementing direct user feedback in deep learning and is applicable to a wide range of applications.
Tasks Denoising, Image Denoising
Published 2018-07-24
URL http://arxiv.org/abs/1807.09303v2
PDF http://arxiv.org/pdf/1807.09303v2.pdf
PWC https://paperswithcode.com/paper/user-loss-a-forced-choice-inspired-approach
Repo
Framework
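
The precision-learning idea above (a network reduced to a Laplacian pyramid with six trainable parameters) can be sketched directly: per-level gains are the only weights, so very few forced-choice user comparisons are needed to fit them. The pyramid construction below is a generic one, assumed rather than taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrainablePyramidDenoiser(nn.Module):
    """Precision-learning sketch: a Laplacian pyramid whose per-level gains
    are the only trainable weights (6 parameters for a 6-band pyramid).
    With all gains at 1 the module is the identity."""
    def __init__(self, levels: int = 6):
        super().__init__()
        self.levels = levels
        self.gains = nn.Parameter(torch.ones(levels))

    def forward(self, x):
        bands, cur = [], x
        for _ in range(self.levels - 1):
            down = F.avg_pool2d(cur, 2)
            up = F.interpolate(down, size=cur.shape[-2:], mode="bilinear",
                               align_corners=False)
            bands.append(cur - up)  # band-pass detail at this scale
            cur = down
        bands.append(cur)  # low-pass residual
        out = self.gains[-1] * bands[-1]
        for i in range(self.levels - 2, -1, -1):
            out = F.interpolate(out, size=bands[i].shape[-2:], mode="bilinear",
                                align_corners=False)
            out = out + self.gains[i] * bands[i]
        return out
```

Attenuating high-frequency gains smooths the output; boosting them sharpens it, which is exactly the sharpness-versus-denoising axis the forced-choice experiment lets each user tune.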