January 26, 2020

Paper Group ANR 1568

Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization

Title Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization
Authors Shing-Jiuan Liu, Ronald Y. Chang, Feng-Tsun Chien
Abstract Device-free Wi-Fi indoor localization has received significant attention as a key enabling technology for many Internet of Things (IoT) applications. Machine learning-based location estimators, such as the deep neural network (DNN), carry proven potential in achieving high-precision localization performance by automatically learning discriminative features from the noisy wireless signal measurements. However, the inner workings of DNNs are not transparent and not adequately understood, especially in the indoor localization application. In this paper, we provide quantitative and visual explanations for the DNN learning process as well as the critical features that the DNN has learned during the process. Toward this end, we propose to use several visualization techniques, including: 1) dimensionality reduction visualization, to project the high-dimensional feature space to the 2D space to facilitate visualization and interpretation, and 2) visual analytics and information visualization, to quantify relative contributions of each feature with the proposed feature manipulation procedures. The results provide insightful views and plausible explanations of the DNN in device-free Wi-Fi indoor localization using channel state information (CSI) fingerprints.
Tasks Dimensionality Reduction
Published 2019-04-23
URL https://arxiv.org/abs/1904.10154v2
PDF https://arxiv.org/pdf/1904.10154v2.pdf
PWC https://paperswithcode.com/paper/analysis-and-visualization-of-deep-neural
Repo
Framework

Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning

Title Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning
Authors Muhamad Risqi U. Saputra, Pedro P. B. de Gusmao, Sen Wang, Andrew Markham, Niki Trigoni
Abstract Inspired by the cognitive process of humans and animals, Curriculum Learning (CL) trains a model by gradually increasing the difficulty of the training data. In this paper, we study whether CL can be applied to complex geometry problems like estimating monocular Visual Odometry (VO). Unlike existing CL approaches, we present a novel CL strategy for learning the geometry of monocular VO by gradually making the learning objective more difficult during training. To this end, we propose a novel geometry-aware objective function by jointly optimizing relative and composite transformations over small windows via a bounded pose regression loss. A cascade optical flow network followed by a recurrent network with a differentiable windowed composition layer, termed CL-VO, is devised to learn the proposed objective. Evaluation on three real-world datasets shows superior performance of CL-VO over state-of-the-art feature-based and learning-based VO.
Tasks Monocular Visual Odometry, Optical Flow Estimation, Visual Odometry
Published 2019-03-25
URL https://arxiv.org/abs/1903.10543v2
PDF https://arxiv.org/pdf/1903.10543v2.pdf
PWC https://paperswithcode.com/paper/learning-monocular-visual-odometry-through
Repo
Framework
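
The abstract's "composite transformations over small windows" can be illustrated with a small sketch. It assumes planar poses (x, y, theta) rather than the full 3D poses a VO system would use, and the names `compose` and `windowed_composites` are hypothetical; this is not the CL-VO layer itself, only the underlying idea of chaining consecutive relative poses inside a sliding window.

```python
import math

def compose(a, b):
    """Compose two planar poses (x, y, theta): apply b in the frame of a."""
    ax, ay, at = a
    bx, by, bt = b
    return (
        ax + math.cos(at) * bx - math.sin(at) * by,
        ay + math.sin(at) * bx + math.cos(at) * by,
        at + bt,
    )

def windowed_composites(relative_poses, window):
    """Composite pose of every run of `window` consecutive relative poses."""
    composites = []
    for start in range(len(relative_poses) - window + 1):
        pose = (0.0, 0.0, 0.0)  # identity pose at the start of the window
        for rel in relative_poses[start:start + window]:
            pose = compose(pose, rel)
        composites.append(pose)
    return composites

# Toy input (an assumption): move 1 unit forward, then turn 90 degrees, three times.
quarter_turn = (1.0, 0.0, math.pi / 2)
composites = windowed_composites([quarter_turn] * 3, window=2)
```

A training loss could then penalize errors on both the individual relative poses and these window composites; how the two terms are bounded and weighted is specific to the paper and is not reproduced here.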

Modeling and Analysis of Tagging Networks in Stack Exchange Communities

Title Modeling and Analysis of Tagging Networks in Stack Exchange Communities
Authors Xiang Fu, Shangdi Yu, Austin R. Benson
Abstract Large Question-and-Answer (Q&A) platforms support diverse knowledge curation on the Web. While researchers have studied user behavior on the platforms in a variety of contexts, there is relatively little insight into important by-products of user behavior that also encode knowledge. Here, we analyze and model the macroscopic structure of tags applied by users to annotate and catalog questions, using a collection of 168 Stack Exchange websites. We find striking similarity in tagging structure across these Stack Exchange communities, even though each community evolves independently (albeit under similar guidelines). Using our empirical findings, we develop a simple generative model that creates random bipartite graphs of tags and questions. Our model accounts for the tag frequency distribution but does not explicitly account for co-tagging correlations. Even under these constraints, we demonstrate empirically and theoretically that our model can reproduce a number of statistical properties of the co-tagging graph that links tags appearing in the same post.
Tasks
Published 2019-02-06
URL http://arxiv.org/abs/1902.02372v1
PDF http://arxiv.org/pdf/1902.02372v1.pdf
PWC https://paperswithcode.com/paper/modeling-and-analysis-of-tagging-networks-in
Repo
Framework
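
As a rough illustration of a generative model that "creates random bipartite graphs of tags and questions", the sketch below attaches tags to questions in proportion to fixed tag weights and then projects the result onto a co-tagging graph. The sampling rule, the function names and the toy weights are all assumptions made for illustration; the paper's actual model may differ.

```python
import random
from collections import Counter
from itertools import combinations

def sample_tagging_graph(tag_weights, tags_per_question, rng):
    """Give each question k distinct tags, drawn in proportion to tag_weights."""
    tags = list(tag_weights)
    weights = [tag_weights[t] for t in tags]
    questions = []
    for k in tags_per_question:
        if k > len(tags):
            raise ValueError("a question cannot have more tags than exist")
        chosen = set()
        while len(chosen) < k:  # rejection step keeps the tags distinct
            chosen.add(rng.choices(tags, weights=weights)[0])
        questions.append(sorted(chosen))
    return questions

def cotag_counts(questions):
    """Project the bipartite graph: count how often two tags share a question."""
    counts = Counter()
    for tags in questions:
        for pair in combinations(tags, 2):
            counts[pair] += 1
    return counts

# Toy tag popularity (hypothetical values) and per-question tag counts.
rng = random.Random(0)
weights = {"python": 5, "pandas": 3, "numpy": 2, "regex": 1}
questions = sample_tagging_graph(weights, [1, 2, 3, 2], rng)
cotags = cotag_counts(questions)
```

Note that nothing in the sampler encodes co-tagging correlations directly; any structure in `cotags` arises only from the tag frequencies and the number of tags per question.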

Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT

Title Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT
Authors Philipp Seeböck, José Ignacio Orlando, Thomas Schlegl, Sebastian M. Waldstein, Hrvoje Bogunović, Sophie Klimscha, Georg Langs, Ursula Schmidt-Erfurth
Abstract Diagnosis and treatment guidance are aided by detecting relevant biomarkers in medical images. Although supervised deep learning can perform accurate segmentation of pathological areas, it is limited by requiring a-priori definitions of these regions, large-scale annotations, and a representative patient cohort in the training set. In contrast, anomaly detection is not limited to specific definitions of pathologies and allows for training on healthy samples without annotation. Anomalous regions can then serve as candidates for biomarker discovery. Knowledge about normal anatomical structure brings implicit information for detecting anomalies. We propose to take advantage of this property using Bayesian deep learning, based on the assumption that epistemic uncertainties will correlate with anatomical deviations from a normal training set. A Bayesian U-Net is trained on a well-defined healthy environment using weak labels of healthy anatomy produced by existing methods. At test time, we capture epistemic uncertainty estimates of our model using Monte Carlo dropout. A novel post-processing technique is then applied to exploit these estimates and transfer their layered appearance to smooth blob-shaped segmentations of the anomalies. We experimentally validated this approach in retinal optical coherence tomography (OCT) images, using weak labels of retinal layers. Our method achieved a Dice index of 0.789 in an independent anomaly test set of age-related macular degeneration (AMD) cases. The resulting segmentations allowed very high accuracy for separating healthy and diseased cases with late wet AMD, dry geographic atrophy (GA), diabetic macular edema (DME) and retinal vein occlusion (RVO). Finally, we qualitatively observed that our approach can also detect other deviations in normal scans such as cut edge artifacts.
Tasks Anomaly Detection
Published 2019-05-29
URL https://arxiv.org/abs/1905.12806v1
PDF https://arxiv.org/pdf/1905.12806v1.pdf
PWC https://paperswithcode.com/paper/exploiting-epistemic-uncertainty-of-anatomy
Repo
Framework
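
The two quantities named in the abstract, Monte Carlo dropout uncertainty and the Dice index, can be sketched as follows. The per-pixel variance across stochastic passes is a common proxy for epistemic uncertainty; the plain threshold used here is only a stand-in for the paper's post-processing, and all numbers are made-up toy values.

```python
from statistics import pvariance

def pixelwise_uncertainty(passes):
    """Variance of each pixel's prediction across T stochastic forward passes.

    `passes` is a list of T flat lists of per-pixel probabilities, one list
    per forward pass with dropout left switched on.
    """
    return [pvariance(values) for values in zip(*passes)]

def dice(pred, target):
    """Dice index 2|A & B| / (|A| + |B|) for two binary masks."""
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 2.0 * overlap / total if total else 1.0

# Three hypothetical dropout passes over a 4-pixel image.
passes = [
    [0.9, 0.1, 0.5, 0.2],
    [0.9, 0.9, 0.5, 0.8],
    [0.9, 0.1, 0.5, 0.2],
]
uncertainty = pixelwise_uncertainty(passes)

# Assumed threshold: pixels where the passes disagree are flagged as anomalous.
mask = [1 if u > 0.05 else 0 for u in uncertainty]
score = dice(mask, [0, 1, 0, 0])
```

Pixels where the passes agree get zero variance regardless of the predicted value, which is why uncertainty rather than the prediction itself is used to flag deviations from the healthy training distribution.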

Sparse2Dense: From direct sparse odometry to dense 3D reconstruction

Title Sparse2Dense: From direct sparse odometry to dense 3D reconstruction
Authors Jiexiong Tang, John Folkesson, Patric Jensfelt
Abstract In this paper, we propose a new deep learning based dense monocular SLAM method. Compared to existing methods, the proposed framework constructs a dense 3D model via a sparse to dense mapping using learned surface normals. With single view learned depth estimation as prior for monocular visual odometry, we obtain both accurate positioning and high quality depth reconstruction. The depth and normal are predicted by a single network trained in a tightly coupled manner. Experimental results show that our method significantly improves the performance of visual tracking and depth prediction in comparison to the state-of-the-art in deep monocular dense SLAM.
Tasks 3D Reconstruction, Depth Estimation, Monocular Visual Odometry, Visual Odometry, Visual Tracking
Published 2019-03-21
URL http://arxiv.org/abs/1903.09199v1
PDF http://arxiv.org/pdf/1903.09199v1.pdf
PWC https://paperswithcode.com/paper/sparse2dense-from-direct-sparse-odometry-to
Repo
Framework

Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification

Title Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification
Authors Yong Luo, Dacheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen
Abstract In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g. pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g. color, texture and shape). Currently available tools ignore either the label relationship or the view complementarity. Motivated by the success of the vector-valued function that constructs matrix-valued kernels to explore the multi-label structure in the output space, we introduce multi-view vector-valued manifold regularization (MV$\mathbf{^3}$MR) to integrate multiple features. MV$\mathbf{^3}$MR exploits the complementary property of different features and discovers the intrinsic local geometry of the compact support shared by different features under the theme of manifold regularization. We conducted extensive experiments on two challenging but popular datasets, PASCAL VOC'07 (VOC) and MIR Flickr (MIR), and validated the effectiveness of the proposed MV$\mathbf{^3}$MR for image classification.
Tasks Image Classification
Published 2019-04-08
URL http://arxiv.org/abs/1904.03921v1
PDF http://arxiv.org/pdf/1904.03921v1.pdf
PWC https://paperswithcode.com/paper/multi-view-vector-valued-manifold
Repo
Framework

Policy Message Passing: A New Algorithm for Probabilistic Graph Inference

Title Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
Authors Zhiwei Deng, Greg Mori
Abstract A general graph-structured neural network architecture operates on graphs through two core components: (1) complex enough message functions; (2) a fixed information aggregation process. In this paper, we present the Policy Message Passing algorithm, which takes a probabilistic perspective and reformulates the whole information aggregation as stochastic sequential processes. The algorithm works on a much larger search space, utilizes reasoning history to perform inference, and is robust to noisy edges. We apply our algorithm to multiple complex graph reasoning and prediction tasks and show that our algorithm consistently outperforms state-of-the-art graph-structured models by a significant margin.
Tasks
Published 2019-09-29
URL https://arxiv.org/abs/1909.13196v1
PDF https://arxiv.org/pdf/1909.13196v1.pdf
PWC https://paperswithcode.com/paper/policy-message-passing-a-new-algorithm-for
Repo
Framework

FAN: Focused Attention Networks

Title FAN: Focused Attention Networks
Authors Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi
Abstract Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence, represented by word embedding. Thus far the learning of attention weights has been driven solely by the minimization of task specific loss functions. We introduce a method for learning attention weights to better emphasize informative pair-wise relations between entities. The key component is a novel center-mass cross entropy loss, which can be applied in conjunction with the task specific ones. We further introduce a focused attention backbone to learn these attention weights for general tasks. We demonstrate that the focused supervision leads to improved attention distribution across meaningful entities, and that it enhances the representation by aggregating features from them. Our focused attention module leads to state-of-the-art recovery of relations in a relationship proposal task and boosts performance for various vision and language tasks.
Tasks Document Classification, Object Detection
Published 2019-05-27
URL https://arxiv.org/abs/1905.11498v3
PDF https://arxiv.org/pdf/1905.11498v3.pdf
PWC https://paperswithcode.com/paper/190511498
Repo
Framework

Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes

Title Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes
Authors Xubo Yue, Raed Kontar
Abstract We present a non-parametric prognostic framework for individualized event prediction based on joint modeling of both longitudinal and time-to-event data. Our approach exploits a multivariate Gaussian convolution process (MGCP) to model the evolution of longitudinal signals and a Cox model to map time-to-event data with longitudinal data modeled through the MGCP. Taking advantage of the unique structure imposed by convolved processes, we provide a variational inference framework to simultaneously estimate parameters in the joint MGCP-Cox model. This significantly reduces computational complexity and safeguards against model overfitting. Experiments on synthetic and real-world data show that the proposed framework outperforms state-of-the-art approaches built on two-stage inference and strong parametric assumptions.
Tasks
Published 2019-03-09
URL http://arxiv.org/abs/1903.03867v1
PDF http://arxiv.org/pdf/1903.03867v1.pdf
PWC https://paperswithcode.com/paper/variational-inference-of-joint-models-using
Repo
Framework

Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation

Title Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation
Authors Feng Liu, Jie Lu, Bo Han, Gang Niu, Guangquan Zhang, Masashi Sugiyama
Abstract In unsupervised domain adaptation (UDA), classifiers for the target domain (TD) are trained with clean labeled data from the source domain (SD) and unlabeled data from TD. However, in the wild, it is hard to acquire a large amount of perfectly clean labeled data in SD given limited budget. Hence, we consider a new, more realistic and more challenging problem setting, where classifiers have to be trained with noisy labeled data from SD and unlabeled data from TD—we name it wildly UDA (WUDA). We show that WUDA provably ruins all UDA methods if taking no care of label noise in SD, and to this end, we propose a Butterfly framework, a panacea for all difficulties in WUDA. Butterfly maintains four models (e.g., deep networks) simultaneously, where two take care of all adaptations (i.e., noisy-to-clean, labeled-to-unlabeled, and SD-to-TD-distributional) and then the other two can focus on classification in TD. As a consequence, Butterfly possesses all the necessary components for all the challenges in WUDA. Experiments demonstrate that under WUDA, Butterfly significantly outperforms existing baseline methods.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-05-19
URL https://arxiv.org/abs/1905.07720v2
PDF https://arxiv.org/pdf/1905.07720v2.pdf
PWC https://paperswithcode.com/paper/butterfly-robust-one-step-approach-towards
Repo
Framework

Structural sparsification for Far-field Speaker Recognition with GNA

Title Structural sparsification for Far-field Speaker Recognition with GNA
Authors Jingchi Zhang, Jonathan Huang, Michael Deisher, Hai Li, Yiran Chen
Abstract Recently, deep neural networks (DNNs) have been widely used in speaker recognition. To achieve fast response times and high accuracy, the requirements for hardware resources increase rapidly. However, as speaker recognition is often implemented on mobile devices, it is necessary to maintain a low computational cost while keeping high accuracy in far-field conditions. In this paper, we apply structural sparsification to time-delay neural networks (TDNNs) to remove redundant structures and accelerate execution. On our targeted hardware, our model can remove 60% of parameters while only slightly increasing the equal error rate (EER) by 0.18%, and our structurally sparse model achieves more than 1.5x speedup.
Tasks Speaker Recognition
Published 2019-10-25
URL https://arxiv.org/abs/1910.11488v2
PDF https://arxiv.org/pdf/1910.11488v2.pdf
PWC https://paperswithcode.com/paper/structural-sparsification-for-far-field
Repo
Framework
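
Structural sparsification removes whole units (rows, channels, filters) rather than individual weights, so the remaining matrix stays dense and hardware-friendly. The sketch below shows one simple way to do that, ranking the rows of a weight matrix by L2 norm and dropping the weakest. The norm-based criterion, the `prune_rows` helper and the toy weights are assumptions for illustration, not the procedure used in the paper.

```python
import math

def prune_rows(weights, keep_fraction):
    """Keep the rows with the largest L2 norm and drop the rest as whole units.

    Returns the pruned matrix and the indices of the rows that were kept.
    """
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    n_keep = max(1, round(len(weights) * keep_fraction))
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # preserve the original row order
    return [weights[i] for i in kept], kept

# Toy layer (hypothetical values): rows 1 and 3 are close to zero.
weights = [
    [0.9, -1.2, 0.4],
    [0.01, 0.0, -0.02],
    [0.5, 0.5, 0.5],
    [0.0, 0.03, 0.0],
]
pruned, kept = prune_rows(weights, keep_fraction=0.5)

n_before = sum(len(row) for row in weights)
n_after = sum(len(row) for row in pruned)
removed_fraction = 1 - n_after / n_before
```

In practice the columns of the following layer that consumed the removed rows' outputs would be dropped as well, which is where the speedup comes from.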

Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model

Title Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model
Authors Liang Shen, Jiahua Zhu, Chongyi Fan, Xiaotao Huang, Tian Jin
Abstract The feature frame is a key idea in the feature matching problem between two images. However, most traditional matching methods employ only the spatial location information (the coordinates) and ignore the shape and orientation information of the local feature. Such additional information can be obtained along with coordinates using general co-variant detectors such as DOG, Hessian, Harris-Affine and MSER. In this paper, we develop a novel method based on the Gaussian Mixture Model for co-variant feature matching that considers the feature center coordinates together with the local feature shape and orientation information. We propose three sub-versions of our method for solving the matching problem under different conditions (rigid, affine and non-rigid), all of which are optimized by the expectation maximization algorithm. Due to the effective utilization of the additional shape and orientation information, the proposed model can significantly improve the performance in terms of convergence speed and recall. Besides, it is more robust to outliers.
Tasks
Published 2019-10-26
URL https://arxiv.org/abs/1910.11981v1
PDF https://arxiv.org/pdf/1910.11981v1.pdf
PWC https://paperswithcode.com/paper/novel-co-variant-feature-point-matching-based
Repo
Framework

Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis

Title Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis
Authors Shuvendu Rana, Rory Hampson, Gordon Dobie
Abstract Breast lesion localization using tactile imaging is a new and developing direction in medical science. To achieve the goal, proper image reconstruction and image registration can be a valuable asset. In this paper, a new segmentation-based image surface reconstruction algorithm is used to reconstruct the surface of a breast phantom. In breast tissue, the sub-dermal vein network is used as a distinguishable pattern for reconstruction. The proposed image capturing device contacts the surface of the phantom, and surface deformation will occur due to applied force at the time of scanning. A novel force based surface rectification system is used to reconstruct a deformed surface image to its original structure. For the construction of the full surface from rectified images, advanced affine scale-invariant feature transform (A-SIFT) is proposed to reduce the affine effect at the time of data capture. A camera position based image stitching approach is applied to construct the final original non-rigid surface. The proposed model is validated in theoretical models and real scenarios, to demonstrate its advantages with respect to competing methods. The proposed method, applied to path reconstruction, achieves a positioning accuracy of 99.7%.
Tasks Image Reconstruction, Image Registration, Image Stitching
Published 2019-02-14
URL https://arxiv.org/abs/1902.05340v4
PDF https://arxiv.org/pdf/1902.05340v4.pdf
PWC https://paperswithcode.com/paper/breast-cancer-model-reconstruction-and-image
Repo
Framework

Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test

Title Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test
Authors Ke-Wei Huang, Mengke Qiao, Xuanqi Liu, Siyuan Liu, Mingxi Dai
Abstract This paper proposes a new deep-learning method to construct test statistics by computer vision and metrics learning. The application highlighted in this paper is applying computer vision to Q-Q plots to construct a new test statistic for normality tests. To the best of our knowledge, there is no similar application documented in the literature. Traditionally, there are two families of approaches for verifying the probability distribution of a random variable. Researchers either subjectively assess the Q-Q plot or objectively use a mathematical formula, such as the Kolmogorov-Smirnov test, to formally conduct a normality test. Graphical assessment by human beings is not rigorous, whereas normality test statistics may not be accurate enough when the uniformly most powerful test does not exist. It may take tens of years for statisticians to develop a new test statistic that is more powerful statistically. Our proposed method integrates four components based on deep learning: an image representation learning component of a Q-Q plot, a dimension reduction component, a metrics learning component that best quantifies the differences between two Q-Q plots for normality tests, and a new normality hypothesis testing process. Our experimental results show that the machine-learning-based test statistics can outperform several widely-used traditional normality tests. This study provides convincing evidence that the proposed method could objectively create a powerful test statistic based on Q-Q plots and this method could be modified to construct many more powerful test statistics for other applications in the future.
Tasks Dimensionality Reduction, Metric Learning, Representation Learning
Published 2019-01-23
URL https://arxiv.org/abs/1901.07851v2
PDF https://arxiv.org/pdf/1901.07851v2.pdf
PWC https://paperswithcode.com/paper/computer-vision-and-metrics-learning-for
Repo
Framework
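
For readers unfamiliar with the two traditional approaches the abstract contrasts, the sketch below computes both: the points of a normal Q-Q plot and the one-sample Kolmogorov-Smirnov statistic. It uses only the Python standard library; the plotting position (i + 0.5) / n is one common convention among several, and the function names are illustrative.

```python
from statistics import NormalDist

def qq_points(sample):
    """(theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n keep the probabilities strictly in (0, 1).
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

def ks_statistic(sample, dist):
    """Largest gap between the empirical CDF and the CDF of `dist`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

points = qq_points([3.0, 1.0, 2.0])
d_single = ks_statistic([0.0], NormalDist(mu=0.0, sigma=1.0))
```

If the sample is normal, the Q-Q points fall close to a straight line; the paper's idea is to let a network judge an image of this plot instead of relying on a closed-form statistic such as `ks_statistic`.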

Gradient Weighted Superpixels for Interpretability in CNNs

Title Gradient Weighted Superpixels for Interpretability in CNNs
Authors Thomas Hartley, Kirill Sidorov, Christopher Willis, David Marshall
Abstract As Convolutional Neural Networks embed themselves into our everyday lives, the need for them to be interpretable increases. However, there is often a trade-off between methods that are efficient to compute but produce an explanation that is difficult to interpret, and those that are slow to compute but provide a more interpretable result. This is particularly challenging in problem spaces that require a large input volume, especially video which combines both spatial and temporal dimensions. In this work we introduce the idea of scoring superpixels through the use of gradient based pixel scoring techniques. We show qualitatively and quantitatively that this is able to approximate LIME, in a fraction of the time. We investigate our techniques using both image classification, and action recognition networks on large scale datasets (ImageNet and Kinetics-400 respectively).
Tasks Image Classification
Published 2019-08-16
URL https://arxiv.org/abs/1908.08997v1
PDF https://arxiv.org/pdf/1908.08997v1.pdf
PWC https://paperswithcode.com/paper/gradient-weighted-superpixels-for
Repo
Framework
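
The idea of "scoring superpixels through the use of gradient based pixel scoring techniques" can be sketched as an aggregation step: given a per-pixel saliency map and a superpixel segmentation, pool the pixel scores inside each segment. Averaging absolute values is an assumed pooling rule chosen for simplicity, and the inputs below are toy data; the paper may weight or normalize the scores differently.

```python
from collections import defaultdict

def score_superpixels(pixel_scores, segments):
    """Mean absolute pixel score per superpixel.

    `pixel_scores` and `segments` are 2D lists of the same shape; `segments`
    holds the integer superpixel label of each pixel.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score_row, label_row in zip(pixel_scores, segments):
        for score, label in zip(score_row, label_row):
            totals[label] += abs(score)
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}

# Hypothetical gradient-based saliency map and a two-segment superpixel layout.
pixel_scores = [
    [0.5, -0.5, 0.0],
    [1.0, 0.0, 0.0],
]
segments = [
    [0, 0, 1],
    [0, 1, 1],
]
scores = score_superpixels(pixel_scores, segments)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because this needs only one gradient computation plus a pooling pass, it avoids the many perturbed forward passes that LIME requires, which is consistent with the speed advantage the abstract reports.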