January 26, 2020


Paper Group ANR 1568


Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization


Title	Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization
Authors	Shing-Jiuan Liu, Ronald Y. Chang, Feng-Tsun Chien
Abstract	Device-free Wi-Fi indoor localization has received significant attention as a key enabling technology for many Internet of Things (IoT) applications. Machine learning-based location estimators, such as the deep neural network (DNN), carry proven potential in achieving high-precision localization performance by automatically learning discriminative features from the noisy wireless signal measurements. However, the inner workings of DNNs are not transparent and not adequately understood, especially in the indoor localization application. In this paper, we provide quantitative and visual explanations for the DNN learning process as well as the critical features that the DNN has learned during the process. Toward this end, we propose to use several visualization techniques, including: 1) dimensionality reduction visualization, to project the high-dimensional feature space to the 2D space to facilitate visualization and interpretation, and 2) visual analytics and information visualization, to quantify relative contributions of each feature with the proposed feature manipulation procedures. The results provide insightful views and plausible explanations of the DNN in device-free Wi-Fi indoor localization using channel state information (CSI) fingerprints.
Tasks	Dimensionality Reduction
Published	2019-04-23
URL	https://arxiv.org/abs/1904.10154v2
PDF	https://arxiv.org/pdf/1904.10154v2.pdf
PWC	https://paperswithcode.com/paper/analysis-and-visualization-of-deep-neural
Repo
Framework

Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning


Title	Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning
Authors	Muhamad Risqi U. Saputra, Pedro P. B. de Gusmao, Sen Wang, Andrew Markham, Niki Trigoni
Abstract	Inspired by the cognitive process of humans and animals, Curriculum Learning (CL) trains a model by gradually increasing the difficulty of the training data. In this paper, we study whether CL can be applied to complex geometry problems like estimating monocular Visual Odometry (VO). Unlike existing CL approaches, we present a novel CL strategy for learning the geometry of monocular VO by gradually making the learning objective more difficult during training. To this end, we propose a novel geometry-aware objective function by jointly optimizing relative and composite transformations over small windows via a bounded pose regression loss. A cascade optical flow network followed by a recurrent network with a differentiable windowed composition layer, termed CL-VO, is devised to learn the proposed objective. Evaluation on three real-world datasets shows superior performance of CL-VO over state-of-the-art feature-based and learning-based VO.
Tasks	Monocular Visual Odometry, Optical Flow Estimation, Visual Odometry
Published	2019-03-25
URL	https://arxiv.org/abs/1903.10543v2
PDF	https://arxiv.org/pdf/1903.10543v2.pdf
PWC	https://paperswithcode.com/paper/learning-monocular-visual-odometry-through
Repo
Framework

Modeling and Analysis of Tagging Networks in Stack Exchange Communities


Title	Modeling and Analysis of Tagging Networks in Stack Exchange Communities
Authors	Xiang Fu, Shangdi Yu, Austin R. Benson
Abstract	Large Question-and-Answer (Q&A) platforms support diverse knowledge curation on the Web. While researchers have studied user behavior on the platforms in a variety of contexts, there is relatively little insight into important by-products of user behavior that also encode knowledge. Here, we analyze and model the macroscopic structure of tags applied by users to annotate and catalog questions, using a collection of 168 Stack Exchange websites. We find striking similarity in tagging structure across these Stack Exchange communities, even though each community evolves independently (albeit under similar guidelines). Using our empirical findings, we develop a simple generative model that creates random bipartite graphs of tags and questions. Our model accounts for the tag frequency distribution but does not explicitly account for co-tagging correlations. Even under these constraints, we demonstrate empirically and theoretically that our model can reproduce a number of statistical properties of the co-tagging graph that links tags appearing in the same post.
Tasks
Published	2019-02-06
URL	http://arxiv.org/abs/1902.02372v1
PDF	http://arxiv.org/pdf/1902.02372v1.pdf
PWC	https://paperswithcode.com/paper/modeling-and-analysis-of-tagging-networks-in
Repo
Framework
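
The abstract above describes a generative model that produces random bipartite graphs of tags and questions from the tag frequency distribution alone, without modeling co-tagging correlations. The paper's exact model is not given here, so the following is only a minimal sketch of that general idea under stated assumptions: each question draws a fixed number of distinct tags with probability proportional to a tag weight, independently of every other question. The names `sample_tagging_graph`, `cotag_graph`, `tag_weights` and `tags_per_question` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def sample_tagging_graph(tag_weights, num_questions, tags_per_question, seed=0):
    """Sample a toy bipartite tag-question graph.

    Each question receives `tags_per_question` distinct tags, drawn with
    probability proportional to `tag_weights` (an assumed stand-in for the
    empirical tag frequency distribution). Questions are independent.
    """
    usable = [t for t, w in tag_weights.items() if w > 0]
    if tags_per_question > len(usable):
        raise ValueError("not enough tags with positive weight")
    weights = [tag_weights[t] for t in usable]
    rng = random.Random(seed)
    questions = []
    for _ in range(num_questions):
        chosen = set()
        # Redraw until the question has the required number of distinct tags.
        while len(chosen) < tags_per_question:
            chosen.add(rng.choices(usable, weights=weights, k=1)[0])
        questions.append(frozenset(chosen))
    return questions


def cotag_graph(questions):
    """Count how often each unordered tag pair appears on the same question."""
    edges = Counter()
    for tags in questions:
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges
```

The co-tagging graph is derived purely from the sampled bipartite graph, so any structure it shows comes from the tag weights rather than from explicit pairwise correlations.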

Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT


Title	Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT
Authors	Philipp Seeböck, José Ignacio Orlando, Thomas Schlegl, Sebastian M. Waldstein, Hrvoje Bogunović, Sophie Klimscha, Georg Langs, Ursula Schmidt-Erfurth
Abstract	Diagnosis and treatment guidance are aided by detecting relevant biomarkers in medical images. Although supervised deep learning can perform accurate segmentation of pathological areas, it is limited by requiring a-priori definitions of these regions, large-scale annotations, and a representative patient cohort in the training set. In contrast, anomaly detection is not limited to specific definitions of pathologies and allows for training on healthy samples without annotation. Anomalous regions can then serve as candidates for biomarker discovery. Knowledge about normal anatomical structure brings implicit information for detecting anomalies. We propose to take advantage of this property using Bayesian deep learning, based on the assumption that epistemic uncertainties will correlate with anatomical deviations from a normal training set. A Bayesian U-Net is trained on a well-defined healthy environment using weak labels of healthy anatomy produced by existing methods. At test time, we capture epistemic uncertainty estimates of our model using Monte Carlo dropout. A novel post-processing technique is then applied to exploit these estimates and transfer their layered appearance to smooth blob-shaped segmentations of the anomalies. We experimentally validated this approach in retinal optical coherence tomography (OCT) images, using weak labels of retinal layers. Our method achieved a Dice index of 0.789 in an independent anomaly test set of age-related macular degeneration (AMD) cases. The resulting segmentations allowed very high accuracy for separating healthy and diseased cases with late wet AMD, dry geographic atrophy (GA), diabetic macular edema (DME) and retinal vein occlusion (RVO). Finally, we qualitatively observed that our approach can also detect other deviations in normal scans such as cut edge artifacts.
Tasks	Anomaly Detection
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12806v1
PDF	https://arxiv.org/pdf/1905.12806v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-epistemic-uncertainty-of-anatomy
Repo
Framework
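
Two quantities in the abstract above lend themselves to a short illustration: the Dice index used for evaluation, and the per-pixel spread of repeated stochastic forward passes that Monte Carlo dropout uses as an uncertainty estimate. The sketch below is not the authors' pipeline. It assumes masks and predictions are flat sequences, and `stochastic_predict` is a hypothetical callable standing in for a network evaluated with dropout left active.

```python
from statistics import pvariance


def dice_index(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mc_dropout_uncertainty(stochastic_predict, x, num_samples=20):
    """Per-pixel variance over repeated stochastic forward passes.

    `stochastic_predict(x)` must return one flat list of per-pixel outputs
    and may return a different list on every call.
    """
    samples = [stochastic_predict(x) for _ in range(num_samples)]
    # zip(*samples) groups the samples belonging to the same pixel.
    return [pvariance(pixel_samples) for pixel_samples in zip(*samples)]
```

A pixel whose output never changes across passes gets variance zero, while a pixel the model is unsure about gets a larger value. The post-processing that turns such a map into a segmentation is specific to the paper and is not reproduced here.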

Sparse2Dense: From direct sparse odometry to dense 3D reconstruction


Title	Sparse2Dense: From direct sparse odometry to dense 3D reconstruction
Authors	Jiexiong Tang, John Folkesson, Patric Jensfelt
Abstract	In this paper, we propose a new deep learning based dense monocular SLAM method. Compared to existing methods, the proposed framework constructs a dense 3D model via a sparse to dense mapping using learned surface normals. With single view learned depth estimation as prior for monocular visual odometry, we obtain both accurate positioning and high quality depth reconstruction. The depth and normal are predicted by a single network trained in a tightly coupled manner. Experimental results show that our method significantly improves the performance of visual tracking and depth prediction in comparison to the state-of-the-art in deep monocular dense SLAM.
Tasks	3D Reconstruction, Depth Estimation, Monocular Visual Odometry, Visual Odometry, Visual Tracking
Published	2019-03-21
URL	http://arxiv.org/abs/1903.09199v1
PDF	http://arxiv.org/pdf/1903.09199v1.pdf
PWC	https://paperswithcode.com/paper/sparse2dense-from-direct-sparse-odometry-to
Repo
Framework

Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification


Title	Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification
Authors	Yong Luo, Dacheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen
Abstract	In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g. pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g. color, texture and shape). Currently available tools ignore either the label relationship or the view complementarity. Motivated by the success of the vector-valued function that constructs matrix-valued kernels to explore the multi-label structure in the output space, we introduce multi-view vector-valued manifold regularization (MV$^3$MR) to integrate multiple features. MV$^3$MR exploits the complementary property of different features and discovers the intrinsic local geometry of the compact support shared by different features under the theme of manifold regularization. We conducted extensive experiments on two challenging but popular datasets, PASCAL VOC'07 (VOC) and MIR Flickr (MIR), and validated the effectiveness of the proposed MV$^3$MR for image classification.
Tasks	Image Classification
Published	2019-04-08
URL	http://arxiv.org/abs/1904.03921v1
PDF	http://arxiv.org/pdf/1904.03921v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-vector-valued-manifold
Repo
Framework

Policy Message Passing: A New Algorithm for Probabilistic Graph Inference


Title	Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
Authors	Zhiwei Deng, Greg Mori
Abstract	A general graph-structured neural network architecture operates on graphs through two core components: (1) complex enough message functions; (2) a fixed information aggregation process. In this paper, we present the Policy Message Passing algorithm, which takes a probabilistic perspective and reformulates the whole information aggregation as stochastic sequential processes. The algorithm works on a much larger search space, utilizes reasoning history to perform inference, and is robust to noisy edges. We apply our algorithm to multiple complex graph reasoning and prediction tasks and show that our algorithm consistently outperforms state-of-the-art graph-structured models by a significant margin.
Tasks
Published	2019-09-29
URL	https://arxiv.org/abs/1909.13196v1
PDF	https://arxiv.org/pdf/1909.13196v1.pdf
PWC	https://paperswithcode.com/paper/policy-message-passing-a-new-algorithm-for
Repo
Framework

FAN: Focused Attention Networks


Title	FAN: Focused Attention Networks
Authors	Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi
Abstract	Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence, represented by word embedding. Thus far the learning of attention weights has been driven solely by the minimization of task specific loss functions. We introduce a method for learning attention weights to better emphasize informative pair-wise relations between entities. The key component is a novel center-mass cross entropy loss, which can be applied in conjunction with the task specific ones. We further introduce a focused attention backbone to learn these attention weights for general tasks. We demonstrate that the focused supervision leads to improved attention distribution across meaningful entities, and that it enhances the representation by aggregating features from them. Our focused attention module leads to state-of-the-art recovery of relations in a relationship proposal task and boosts performance for various vision and language tasks.
Tasks	Document Classification, Object Detection
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11498v3
PDF	https://arxiv.org/pdf/1905.11498v3.pdf
PWC	https://paperswithcode.com/paper/190511498
Repo
Framework

Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes


Title	Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes
Authors	Xubo Yue, Raed Kontar
Abstract	We present a non-parametric prognostic framework for individualized event prediction based on joint modeling of both longitudinal and time-to-event data. Our approach exploits a multivariate Gaussian convolution process (MGCP) to model the evolution of longitudinal signals and a Cox model to map time-to-event data with longitudinal data modeled through the MGCP. Taking advantage of the unique structure imposed by convolved processes, we provide a variational inference framework to simultaneously estimate parameters in the joint MGCP-Cox model. This significantly reduces computational complexity and safeguards against model overfitting. Experiments on synthetic and real world data show that the proposed framework outperforms state-of-the-art approaches built on two-stage inference and strong parametric assumptions.
Tasks
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03867v1
PDF	http://arxiv.org/pdf/1903.03867v1.pdf
PWC	https://paperswithcode.com/paper/variational-inference-of-joint-models-using
Repo
Framework

Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation


Title	Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation
Authors	Feng Liu, Jie Lu, Bo Han, Gang Niu, Guangquan Zhang, Masashi Sugiyama
Abstract	In unsupervised domain adaptation (UDA), classifiers for the target domain (TD) are trained with clean labeled data from the source domain (SD) and unlabeled data from TD. However, in the wild, it is hard to acquire a large amount of perfectly clean labeled data in SD given limited budget. Hence, we consider a new, more realistic and more challenging problem setting, where classifiers have to be trained with noisy labeled data from SD and unlabeled data from TD—we name it wildly UDA (WUDA). We show that WUDA provably ruins all UDA methods if taking no care of label noise in SD, and to this end, we propose a Butterfly framework, a panacea for all difficulties in WUDA. Butterfly maintains four models (e.g., deep networks) simultaneously, where two take care of all adaptations (i.e., noisy-to-clean, labeled-to-unlabeled, and SD-to-TD-distributional) and then the other two can focus on classification in TD. As a consequence, Butterfly possesses all the necessary components for all the challenges in WUDA. Experiments demonstrate that under WUDA, Butterfly significantly outperforms existing baseline methods.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-05-19
URL	https://arxiv.org/abs/1905.07720v2
PDF	https://arxiv.org/pdf/1905.07720v2.pdf
PWC	https://paperswithcode.com/paper/butterfly-robust-one-step-approach-towards
Repo
Framework

Structural sparsification for Far-field Speaker Recognition with GNA


Title	Structural sparsification for Far-field Speaker Recognition with GNA
Authors	Jingchi Zhang, Jonathan Huang, Michael Deisher, Hai Li, Yiran Chen
Abstract	Recently, deep neural networks (DNN) have been widely used in the speaker recognition area. In order to achieve fast response time and high accuracy, the requirements for hardware resources increase rapidly. However, as the speaker recognition application is often implemented on mobile devices, it is necessary to maintain a low computational cost while keeping high accuracy in far-field conditions. In this paper, we apply structural sparsification on time-delay neural networks (TDNN) to remove redundant structures and accelerate the execution. On our targeted hardware, we can remove 60% of the parameters while increasing the equal error rate (EER) by only 0.18%, and our structurally sparse model achieves more than 1.5x speedup.
Tasks	Speaker Recognition
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11488v2
PDF	https://arxiv.org/pdf/1910.11488v2.pdf
PWC	https://paperswithcode.com/paper/structural-sparsification-for-far-field
Repo
Framework
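
The abstract above does not say how the redundant structures are selected, so the following is only a generic sketch of structured pruning and should not be read as the authors' method. It assumes a layer stored as a list of weight rows (one row per output unit) and drops whole rows with the smallest L2 norm, which is one common heuristic. `prune_rows` and `remove_fraction` are hypothetical names.

```python
import math


def prune_rows(weight_rows, remove_fraction):
    """Remove the given fraction of rows with the smallest L2 norm.

    Returns (kept_indices, kept_rows). Removing whole rows, rather than
    individual weights, is what makes the resulting sparsity structural.
    """
    if not 0.0 <= remove_fraction <= 1.0:
        raise ValueError("remove_fraction must be in [0, 1]")
    norms = [math.sqrt(sum(w * w for w in row)) for row in weight_rows]
    num_remove = round(len(weight_rows) * remove_fraction)
    # Indices ordered from smallest to largest norm.
    by_norm = sorted(range(len(weight_rows)), key=lambda i: norms[i])
    removed = set(by_norm[:num_remove])
    kept = [i for i in range(len(weight_rows)) if i not in removed]
    return kept, [weight_rows[i] for i in kept]
```

Because entire rows disappear, the remaining layer is a smaller dense matrix, which is the kind of structure that dense hardware can exploit for speedup.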

Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model


Title	Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model
Authors	Liang Shen, Jiahua Zhu, Chongyi Fan, Xiaotao Huang, Tian Jin
Abstract	The feature frame is a key idea in the feature matching problem between two images. However, most traditional matching methods employ only the spatial location information (the coordinates), ignoring the shape and orientation information of the local feature. Such additional information can be obtained along with coordinates using general co-variant detectors such as DOG, Hessian, Harris-Affine and MSER. In this paper, we develop a novel method for co-variant feature matching, based on the Gaussian Mixture Model, that considers the feature center position coordinates, the local feature shape and the orientation information together. We propose three sub-versions of our method for solving the matching problem under different conditions (rigid, affine and non-rigid, respectively), all of which are optimized by the expectation-maximization algorithm. Due to the effective utilization of the additional shape and orientation information, the proposed model can significantly improve the performance in terms of convergence speed and recall. Besides, it is more robust to outliers.
Tasks
Published	2019-10-26
URL	https://arxiv.org/abs/1910.11981v1
PDF	https://arxiv.org/pdf/1910.11981v1.pdf
PWC	https://paperswithcode.com/paper/novel-co-variant-feature-point-matching-based
Repo
Framework

Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis


Title	Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis
Authors	Shuvendu Rana, Rory Hampson, Gordon Dobie
Abstract	Breast lesion localization using tactile imaging is a new and developing direction in medical science. To achieve the goal, proper image reconstruction and image registration can be a valuable asset. In this paper, a new segmentation-based image surface reconstruction algorithm is used to reconstruct the surface of a breast phantom. In breast tissue, the sub-dermal vein network is used as a distinguishable pattern for reconstruction. The proposed image capturing device contacts the surface of the phantom, and surface deformation occurs due to the force applied at the time of scanning. A novel force-based surface rectification system is used to restore a deformed surface image to its original structure. For the construction of the full surface from rectified images, advanced affine scale-invariant feature transform (A-SIFT) is proposed to reduce the affine effect during data capture. A camera-position-based image stitching approach is applied to construct the final original non-rigid surface. The proposed model is validated in theoretical models and real scenarios, to demonstrate its advantages with respect to competing methods. The proposed method, applied to path reconstruction, achieves a positioning accuracy of 99.7%.
Tasks	Image Reconstruction, Image Registration, Image Stitching
Published	2019-02-14
URL	https://arxiv.org/abs/1902.05340v4
PDF	https://arxiv.org/pdf/1902.05340v4.pdf
PWC	https://paperswithcode.com/paper/breast-cancer-model-reconstruction-and-image
Repo
Framework

Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test


Title	Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test
Authors	Ke-Wei Huang, Mengke Qiao, Xuanqi Liu, Siyuan Liu, Mingxi Dai
Abstract	This paper proposes a new deep-learning method to construct test statistics by computer vision and metrics learning. The application highlighted in this paper is applying computer vision to Q-Q plots to construct a new test statistic for normality tests. To the best of our knowledge, there is no similar application documented in the literature. Traditionally, there are two families of approaches for verifying the probability distribution of a random variable. Researchers either subjectively assess the Q-Q plot or objectively use a mathematical formula, such as the Kolmogorov-Smirnov test, to formally conduct a normality test. Graphical assessment by human beings is not rigorous, whereas normality test statistics may not be accurate enough when the uniformly most powerful test does not exist. It may take tens of years for statisticians to develop a new test statistic that is more powerful statistically. Our proposed method integrates four components based on deep learning: an image representation learning component of a Q-Q plot, a dimension reduction component, a metrics learning component that best quantifies the differences between two Q-Q plots for normality tests, and a new normality hypothesis testing process. Our experimental results show that the machine-learning-based test statistics can outperform several widely used traditional normality tests. This study provides convincing evidence that the proposed method could objectively create a powerful test statistic based on Q-Q plots and that this method could be modified to construct many more powerful test statistics for other applications in the future.
Tasks	Dimensionality Reduction, Metric Learning, Representation Learning
Published	2019-01-23
URL	https://arxiv.org/abs/1901.07851v2
PDF	https://arxiv.org/pdf/1901.07851v2.pdf
PWC	https://paperswithcode.com/paper/computer-vision-and-metrics-learning-for
Repo
Framework
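
The method above starts from a normal Q-Q plot, so it may help to see how the points of such a plot are computed before any image is rendered or any network is involved. This is a minimal sketch using only the Python standard library. The plotting position (i - 0.5) / n is one common convention and is an assumption here, as is standardizing the sample by its own mean and standard deviation. `qq_points` is a hypothetical helper name.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical_quantile, standardized_sample_quantile) pairs.

    For data that are close to normal, the points lie near the diagonal y = x.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mu, sigma = mean(sample), stdev(sample)
    if sigma == 0:
        raise ValueError("sample has zero variance")
    standard_normal = NormalDist()
    points = []
    for i, x in enumerate(sorted(sample), start=1):
        p = (i - 0.5) / n  # assumed plotting position
        points.append((standard_normal.inv_cdf(p), (x - mu) / sigma))
    return points
```

Judging by eye how far these points stray from the diagonal is the subjective step the abstract mentions. The paper replaces that judgment with learned components, which are not sketched here.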

Gradient Weighted Superpixels for Interpretability in CNNs


Title	Gradient Weighted Superpixels for Interpretability in CNNs
Authors	Thomas Hartley, Kirill Sidorov, Christopher Willis, David Marshall
Abstract	As Convolutional Neural Networks embed themselves into our everyday lives, the need for them to be interpretable increases. However, there is often a trade-off between methods that are efficient to compute but produce an explanation that is difficult to interpret, and those that are slow to compute but provide a more interpretable result. This is particularly challenging in problem spaces that require a large input volume, especially video which combines both spatial and temporal dimensions. In this work we introduce the idea of scoring superpixels through the use of gradient based pixel scoring techniques. We show qualitatively and quantitatively that this is able to approximate LIME, in a fraction of the time. We investigate our techniques using both image classification, and action recognition networks on large scale datasets (ImageNet and Kinetics-400 respectively).
Tasks	Image Classification
Published	2019-08-16
URL	https://arxiv.org/abs/1908.08997v1
PDF	https://arxiv.org/pdf/1908.08997v1.pdf
PWC	https://paperswithcode.com/paper/gradient-weighted-superpixels-for
Repo
Framework
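
The core idea in the abstract above, scoring superpixels from gradient-based pixel scores, can be illustrated with a small aggregation step. The sketch assumes the per-pixel scores and the superpixel segmentation have already been computed elsewhere, and it aggregates by taking the mean score within each superpixel. The choice of the mean is an assumption, since the abstract does not state the aggregation rule. `score_superpixels` is a hypothetical name.

```python
from collections import defaultdict


def score_superpixels(pixel_scores, segment_labels):
    """Average per-pixel scores within each superpixel.

    `pixel_scores` and `segment_labels` are 2D lists of the same shape:
    a score per pixel and the integer superpixel id of that pixel.
    Returns a dict mapping superpixel id to its mean score.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for score_row, label_row in zip(pixel_scores, segment_labels):
        if len(score_row) != len(label_row):
            raise ValueError("score and label rows must have equal length")
        for score, label in zip(score_row, label_row):
            sums[label] += score
            counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}
```

For example, with scores `[[1, 1, 0, 0], [3, 3, 0, 2]]` and labels `[[0, 0, 1, 1], [0, 0, 1, 1]]`, superpixel 0 scores 2.0 and superpixel 1 scores 0.5. Ranking superpixels by such a score needs a single gradient pass per input rather than many perturbed forward passes.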