Paper Group ANR 1568
Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization
Title | Analysis and Visualization of Deep Neural Networks in Device-Free Wi-Fi Indoor Localization |
Authors | Shing-Jiuan Liu, Ronald Y. Chang, Feng-Tsun Chien |
Abstract | Device-free Wi-Fi indoor localization has received significant attention as a key enabling technology for many Internet of Things (IoT) applications. Machine learning-based location estimators, such as the deep neural network (DNN), carry proven potential in achieving high-precision localization performance by automatically learning discriminative features from the noisy wireless signal measurements. However, the inner workings of DNNs are not transparent and not adequately understood especially in the indoor localization application. In this paper, we provide quantitative and visual explanations for the DNN learning process as well as the critical features that DNN has learned during the process. Toward this end, we propose to use several visualization techniques, including: 1) dimensionality reduction visualization, to project the high-dimensional feature space to the 2D space to facilitate visualization and interpretation, and 2) visual analytics and information visualization, to quantify relative contributions of each feature with the proposed feature manipulation procedures. The results provide insightful views and plausible explanations of the DNN in device-free Wi-Fi indoor localization using channel state information (CSI) fingerprints. |
Tasks | Dimensionality Reduction |
Published | 2019-04-23 |
URL | https://arxiv.org/abs/1904.10154v2 |
https://arxiv.org/pdf/1904.10154v2.pdf | |
PWC | https://paperswithcode.com/paper/analysis-and-visualization-of-deep-neural |
Repo | |
Framework | |
Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning
Title | Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning |
Authors | Muhamad Risqi U. Saputra, Pedro P. B. de Gusmao, Sen Wang, Andrew Markham, Niki Trigoni |
Abstract | Inspired by the cognitive process of humans and animals, Curriculum Learning (CL) trains a model by gradually increasing the difficulty of the training data. In this paper, we study whether CL can be applied to complex geometry problems like estimating monocular Visual Odometry (VO). Unlike existing CL approaches, we present a novel CL strategy for learning the geometry of monocular VO by gradually making the learning objective more difficult during training. To this end, we propose a novel geometry-aware objective function by jointly optimizing relative and composite transformations over small windows via a bounded pose regression loss. A cascade optical flow network followed by a recurrent network with a differentiable windowed composition layer, termed CL-VO, is devised to learn the proposed objective. Evaluation on three real-world datasets shows superior performance of CL-VO over state-of-the-art feature-based and learning-based VO. |
Tasks | Monocular Visual Odometry, Optical Flow Estimation, Visual Odometry |
Published | 2019-03-25 |
URL | https://arxiv.org/abs/1903.10543v2 |
https://arxiv.org/pdf/1903.10543v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-monocular-visual-odometry-through |
Repo | |
Framework | |
Modeling and Analysis of Tagging Networks in Stack Exchange Communities
Title | Modeling and Analysis of Tagging Networks in Stack Exchange Communities |
Authors | Xiang Fu, Shangdi Yu, Austin R. Benson |
Abstract | Large Question-and-Answer (Q&A) platforms support diverse knowledge curation on the Web. While researchers have studied user behavior on the platforms in a variety of contexts, there is relatively little insight into important by-products of user behavior that also encode knowledge. Here, we analyze and model the macroscopic structure of tags applied by users to annotate and catalog questions, using a collection of 168 Stack Exchange websites. We find striking similarity in tagging structure across these Stack Exchange communities, even though each community evolves independently (albeit under similar guidelines). Using our empirical findings, we develop a simple generative model that creates random bipartite graphs of tags and questions. Our model accounts for the tag frequency distribution but does not explicitly account for co-tagging correlations. Even under these constraints, we demonstrate empirically and theoretically that our model can reproduce a number of statistical properties of the co-tagging graph that links tags appearing in the same post. |
Tasks | |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02372v1 |
http://arxiv.org/pdf/1902.02372v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-and-analysis-of-tagging-networks-in |
Repo | |
Framework | |
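The generative model mentioned in the abstract (random bipartite graphs of tags and questions, driven by tag frequencies) can be illustrated with a minimal sketch. It assumes each question draws a fixed number of distinct tags with probability proportional to a per-tag weight; this is a simplification for illustration, not the authors' model, and the names `sample_bipartite` and `cotag_graph` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def sample_bipartite(tag_weights, n_questions, tags_per_question, seed=0):
    """Return one set of tags per question (hypothetical helper).

    Assumption: tags are drawn proportionally to their weight, and a
    question never carries the same tag twice.
    """
    tags = [t for t, w in tag_weights.items() if w > 0]
    if tags_per_question > len(tags):
        raise ValueError("not enough distinct tags with positive weight")
    weights = [tag_weights[t] for t in tags]
    rng = random.Random(seed)
    questions = []
    for _ in range(n_questions):
        chosen = set()
        # Keep drawing until the question has the requested number of distinct tags.
        while len(chosen) < tags_per_question:
            chosen.add(rng.choices(tags, weights=weights, k=1)[0])
        questions.append(chosen)
    return questions

def cotag_graph(questions):
    """Count how often each unordered pair of tags appears on the same question."""
    edges = Counter()
    for tagset in questions:
        for a, b in combinations(sorted(tagset), 2):
            edges[(a, b)] += 1
    return edges

questions = sample_bipartite({"python": 5, "numpy": 2, "pandas": 1}, 100, 2)
edges = cotag_graph(questions)
```

Projecting the bipartite graph onto tags, as `cotag_graph` does, yields the co-tagging graph whose statistics the abstract says the model can reproduce; the sketch makes no claim about matching those statistics.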
Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT
Title | Exploiting Epistemic Uncertainty of Anatomy Segmentation for Anomaly Detection in Retinal OCT |
Authors | Philipp Seeböck, José Ignacio Orlando, Thomas Schlegl, Sebastian M. Waldstein, Hrvoje Bogunović, Sophie Klimscha, Georg Langs, Ursula Schmidt-Erfurth |
Abstract | Diagnosis and treatment guidance are aided by detecting relevant biomarkers in medical images. Although supervised deep learning can perform accurate segmentation of pathological areas, it is limited by requiring a-priori definitions of these regions, large-scale annotations, and a representative patient cohort in the training set. In contrast, anomaly detection is not limited to specific definitions of pathologies and allows for training on healthy samples without annotation. Anomalous regions can then serve as candidates for biomarker discovery. Knowledge about normal anatomical structure brings implicit information for detecting anomalies. We propose to take advantage of this property using Bayesian deep learning, based on the assumption that epistemic uncertainties will correlate with anatomical deviations from a normal training set. A Bayesian U-Net is trained on a well-defined healthy environment using weak labels of healthy anatomy produced by existing methods. At test time, we capture epistemic uncertainty estimates of our model using Monte Carlo dropout. A novel post-processing technique is then applied to exploit these estimates and transfer their layered appearance to smooth blob-shaped segmentations of the anomalies. We experimentally validated this approach in retinal optical coherence tomography (OCT) images, using weak labels of retinal layers. Our method achieved a Dice index of 0.789 in an independent anomaly test set of age-related macular degeneration (AMD) cases. The resulting segmentations allowed very high accuracy for separating healthy and diseased cases with late wet AMD, dry geographic atrophy (GA), diabetic macular edema (DME) and retinal vein occlusion (RVO). Finally, we qualitatively observed that our approach can also detect other deviations in normal scans such as cut edge artifacts. |
Tasks | Anomaly Detection |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12806v1 |
https://arxiv.org/pdf/1905.12806v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-epistemic-uncertainty-of-anatomy |
Repo | |
Framework | |
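The Dice index reported in the abstract measures overlap between a predicted and a reference segmentation, Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, assuming binary masks flattened to 0/1 sequences; the function name `dice_index` and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def dice_index(pred, target):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

score = dice_index([1, 1, 0, 0], [1, 0, 0, 0])  # one shared pixel, three foreground pixels in total
```

In this toy case the masks share one pixel out of three foreground pixels overall, so the score is 2/3.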
Sparse2Dense: From direct sparse odometry to dense 3D reconstruction
Title | Sparse2Dense: From direct sparse odometry to dense 3D reconstruction |
Authors | Jiexiong Tang, John Folkesson, Patric Jensfelt |
Abstract | In this paper, we propose a new deep learning based dense monocular SLAM method. Compared to existing methods, the proposed framework constructs a dense 3D model via a sparse to dense mapping using learned surface normals. With single view learned depth estimation as prior for monocular visual odometry, we obtain both accurate positioning and high quality depth reconstruction. The depth and normal are predicted by a single network trained in a tightly coupled manner. Experimental results show that our method significantly improves the performance of visual tracking and depth prediction in comparison to the state-of-the-art in deep monocular dense SLAM. |
Tasks | 3D Reconstruction, Depth Estimation, Monocular Visual Odometry, Visual Odometry, Visual Tracking |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09199v1 |
http://arxiv.org/pdf/1903.09199v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse2dense-from-direct-sparse-odometry-to |
Repo | |
Framework | |
Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification
Title | Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification |
Authors | Yong Luo, Dacheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen |
Abstract | In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g. pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g. color, texture and shape). Currently available tools ignore either the label relationship or the view complementarity. Motivated by the success of the vector-valued function that constructs matrix-valued kernels to explore the multi-label structure in the output space, we introduce multi-view vector-valued manifold regularization (MV$\mathbf{^3}$MR) to integrate multiple features. MV$\mathbf{^3}$MR exploits the complementary property of different features and discovers the intrinsic local geometry of the compact support shared by different features under the theme of manifold regularization. We conducted extensive experiments on two challenging but popular datasets, PASCAL VOC'07 (VOC) and MIR Flickr (MIR), and validated the effectiveness of the proposed MV$\mathbf{^3}$MR for image classification. |
Tasks | Image Classification |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03921v1 |
http://arxiv.org/pdf/1904.03921v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-vector-valued-manifold |
Repo | |
Framework | |
Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
Title | Policy Message Passing: A New Algorithm for Probabilistic Graph Inference |
Authors | Zhiwei Deng, Greg Mori |
Abstract | A general graph-structured neural network architecture operates on graphs through two core components: (1) complex enough message functions; (2) a fixed information aggregation process. In this paper, we present the Policy Message Passing algorithm, which takes a probabilistic perspective and reformulates the whole information aggregation as stochastic sequential processes. The algorithm works on a much larger search space, utilizes reasoning history to perform inference, and is robust to noisy edges. We apply our algorithm to multiple complex graph reasoning and prediction tasks and show that our algorithm consistently outperforms state-of-the-art graph-structured models by a significant margin. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13196v1 |
https://arxiv.org/pdf/1909.13196v1.pdf | |
PWC | https://paperswithcode.com/paper/policy-message-passing-a-new-algorithm-for |
Repo | |
Framework | |
FAN: Focused Attention Networks
Title | FAN: Focused Attention Networks |
Authors | Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi |
Abstract | Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence, represented by word embedding. Thus far the learning of attention weights has been driven solely by the minimization of task specific loss functions. We introduce a method for learning attention weights to better emphasize informative pair-wise relations between entities. The key component is a novel center-mass cross entropy loss, which can be applied in conjunction with the task specific ones. We further introduce a focused attention backbone to learn these attention weights for general tasks. We demonstrate that the focused supervision leads to improved attention distribution across meaningful entities, and that it enhances the representation by aggregating features from them. Our focused attention module leads to state-of-the-art recovery of relations in a relationship proposal task and boosts performance for various vision and language tasks. |
Tasks | Document Classification, Object Detection |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11498v3 |
https://arxiv.org/pdf/1905.11498v3.pdf | |
PWC | https://paperswithcode.com/paper/190511498 |
Repo | |
Framework | |
Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes
Title | Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes |
Authors | Xubo Yue, Raed Kontar |
Abstract | We present a non-parametric prognostic framework for individualized event prediction based on joint modeling of both longitudinal and time-to-event data. Our approach exploits a multivariate Gaussian convolution process (MGCP) to model the evolution of longitudinal signals and a Cox model to map time-to-event data with longitudinal data modeled through the MGCP. Taking advantage of the unique structure imposed by convolved processes, we provide a variational inference framework to simultaneously estimate parameters in the joint MGCP-Cox model. This significantly reduces computational complexity and safeguards against model overfitting. Experiments on synthetic and real world data show that the proposed framework outperforms state-of-the-art approaches built on two-stage inference and strong parametric assumptions. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03867v1 |
http://arxiv.org/pdf/1903.03867v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-inference-of-joint-models-using |
Repo | |
Framework | |
Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation
Title | Butterfly: A Panacea for All Difficulties in Wildly Unsupervised Domain Adaptation |
Authors | Feng Liu, Jie Lu, Bo Han, Gang Niu, Guangquan Zhang, Masashi Sugiyama |
Abstract | In unsupervised domain adaptation (UDA), classifiers for the target domain (TD) are trained with clean labeled data from the source domain (SD) and unlabeled data from TD. However, in the wild, it is hard to acquire a large amount of perfectly clean labeled data in SD given limited budget. Hence, we consider a new, more realistic and more challenging problem setting, where classifiers have to be trained with noisy labeled data from SD and unlabeled data from TD—we name it wildly UDA (WUDA). We show that WUDA provably ruins all UDA methods if taking no care of label noise in SD, and to this end, we propose a Butterfly framework, a panacea for all difficulties in WUDA. Butterfly maintains four models (e.g., deep networks) simultaneously, where two take care of all adaptations (i.e., noisy-to-clean, labeled-to-unlabeled, and SD-to-TD-distributional) and then the other two can focus on classification in TD. As a consequence, Butterfly possesses all the necessary components for all the challenges in WUDA. Experiments demonstrate that under WUDA, Butterfly significantly outperforms existing baseline methods. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07720v2 |
https://arxiv.org/pdf/1905.07720v2.pdf | |
PWC | https://paperswithcode.com/paper/butterfly-robust-one-step-approach-towards |
Repo | |
Framework | |
Structural sparsification for Far-field Speaker Recognition with GNA
Title | Structural sparsification for Far-field Speaker Recognition with GNA |
Authors | Jingchi Zhang, Jonathan Huang, Michael Deisher, Hai Li, Yiran Chen |
Abstract | Recently, deep neural networks (DNN) have been widely used in the speaker recognition area. In order to achieve fast response time and high accuracy, the requirements for hardware resources increase rapidly. However, as the speaker recognition application is often implemented on mobile devices, it is necessary to maintain a low computational cost while keeping high accuracy in far-field conditions. In this paper, we apply structural sparsification on time-delay neural networks (TDNN) to remove redundant structures and accelerate the execution. On our targeted hardware, our model can remove 60% of parameters while only slightly increasing the equal error rate (EER) by 0.18%, and our structurally sparse model can achieve more than 1.5x speedup. |
Tasks | Speaker Recognition |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11488v2 |
https://arxiv.org/pdf/1910.11488v2.pdf | |
PWC | https://paperswithcode.com/paper/structural-sparsification-for-far-field |
Repo | |
Framework | |
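The equal error rate (EER) quoted in the abstract is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping thresholds over the observed scores; the function name `equal_error_rate`, the "accept when score >= threshold" rule, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from scores of genuine (target) and impostor trials.

    Assumption: higher scores mean "more likely the same speaker", and a
    trial is accepted when its score is >= the threshold.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(genuine) | set(impostor)):
        far = sum(s >= thr for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < thr for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

With these toy scores, the threshold 0.6 accepts one of four impostors and rejects one of four genuine trials, so both error rates are 0.25. Production evaluations usually interpolate between thresholds instead of picking the closest observed one.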
Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model
Title | Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model |
Authors | Liang Shen, Jiahua Zhu, Chongyi Fan, Xiaotao Huang, Tian Jin |
Abstract | The feature frame is a key idea of the feature matching problem between two images. However, most of the traditional matching methods simply employ the spatial location information (the coordinates), which ignores the shape and orientation information of the local feature. Such additional information can be obtained along with coordinates using general co-variant detectors such as DOG, Hessian, Harris-Affine and MSER. In this paper, we develop a novel method considering all the feature center position coordinates, the local feature shape and orientation information based on Gaussian Mixture Model for co-variant feature matching. We propose three sub-versions of our method for solving the matching problem in different conditions: rigid, affine and non-rigid, respectively, all of which are optimized by the expectation maximization algorithm. Due to the effective utilization of the additional shape and orientation information, the proposed model can significantly improve the performance in terms of convergence speed and recall. Besides, it is more robust to outliers. |
Tasks | |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.11981v1 |
https://arxiv.org/pdf/1910.11981v1.pdf | |
PWC | https://paperswithcode.com/paper/novel-co-variant-feature-point-matching-based |
Repo | |
Framework | |
Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis
Title | Breast Cancer: Model Reconstruction and Image Registration from Segmented Deformed Image using Visual and Force based Analysis |
Authors | Shuvendu Rana, Rory Hampson, Gordon Dobie |
Abstract | Breast lesion localization using tactile imaging is a new and developing direction in medical science. To achieve the goal, proper image reconstruction and image registration can be a valuable asset. In this paper, a new segmentation-based image surface reconstruction algorithm is used to reconstruct the surface of a breast phantom. In breast tissue, the sub-dermal vein network is used as a distinguishable pattern for reconstruction. The proposed image capturing device contacts the surface of the phantom, and surface deformation will occur due to applied force at the time of scanning. A novel force based surface rectification system is used to reconstruct a deformed surface image to its original structure. For the construction of the full surface from rectified images, advanced affine scale-invariant feature transform (A-SIFT) is proposed to reduce the affine effect at the time of data capture. A camera position based image stitching approach is applied to construct the final original non-rigid surface. The proposed model is validated in theoretical models and real scenarios, to demonstrate its advantages with respect to competing methods. The result of the proposed method, applied to path reconstruction, ends with a positioning accuracy of 99.7%. |
Tasks | Image Reconstruction, Image Registration, Image Stitching |
Published | 2019-02-14 |
URL | https://arxiv.org/abs/1902.05340v4 |
https://arxiv.org/pdf/1902.05340v4.pdf | |
PWC | https://paperswithcode.com/paper/breast-cancer-model-reconstruction-and-image |
Repo | |
Framework | |
Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test
Title | Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test |
Authors | Ke-Wei Huang, Mengke Qiao, Xuanqi Liu, Siyuan Liu, Mingxi Dai |
Abstract | This paper proposes a new deep-learning method to construct test statistics by computer vision and metrics learning. The application highlighted in this paper is applying computer vision to the Q-Q plot to construct a new test statistic for the normality test. To the best of our knowledge, there is no similar application documented in the literature. Traditionally, there are two families of approaches for verifying the probability distribution of a random variable. Researchers either subjectively assess the Q-Q plot or objectively use a mathematical formula, such as the Kolmogorov-Smirnov test, to formally conduct a normality test. Graphical assessment by human beings is not rigorous whereas normality test statistics may not be accurate enough when the uniformly most powerful test does not exist. It may take tens of years for statisticians to develop a new test statistic that is more powerful statistically. Our proposed method integrates four components based on deep learning: an image representation learning component of a Q-Q plot, a dimension reduction component, a metrics learning component that best quantifies the differences between two Q-Q plots for the normality test, and a new normality hypothesis testing process. Our experimental results show that the machine-learning-based test statistics can outperform several widely-used traditional normality tests. This study provides convincing evidence that the proposed method could objectively create a powerful test statistic based on Q-Q plots and this method could be modified to construct many more powerful test statistics for other applications in the future. |
Tasks | Dimensionality Reduction, Metric Learning, Representation Learning |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07851v2 |
https://arxiv.org/pdf/1901.07851v2.pdf | |
PWC | https://paperswithcode.com/paper/computer-vision-and-metrics-learning-for |
Repo | |
Framework | |
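A normal Q-Q plot, the input image the abstract refers to, pairs the sorted sample values with the quantiles a normal distribution would produce at the same ranks. The sketch below computes those pairs with the Python standard library only; the function name `qq_points`, the plotting positions (i - 0.5) / n, and the standardization step are assumptions (several conventions exist), and nothing here reproduces the paper's deep-learning pipeline.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot.

    Requires at least two values that are not all identical.
    """
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n: one common convention among several.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Standardize so that a normal sample falls close to the line y = x.
    m, s = mean(ordered), stdev(ordered)
    observed = [(x - m) / s for x in ordered]
    return list(zip(theoretical, observed))

points = qq_points([2.0, 4.0, 6.0])
```

Points lying near the diagonal suggest approximate normality; systematic curvature suggests skew or heavy tails. Rendering these pairs as an image would be the step that precedes any vision model.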
Gradient Weighted Superpixels for Interpretability in CNNs
Title | Gradient Weighted Superpixels for Interpretability in CNNs |
Authors | Thomas Hartley, Kirill Sidorov, Christopher Willis, David Marshall |
Abstract | As Convolutional Neural Networks embed themselves into our everyday lives, the need for them to be interpretable increases. However, there is often a trade-off between methods that are efficient to compute but produce an explanation that is difficult to interpret, and those that are slow to compute but provide a more interpretable result. This is particularly challenging in problem spaces that require a large input volume, especially video which combines both spatial and temporal dimensions. In this work we introduce the idea of scoring superpixels through the use of gradient based pixel scoring techniques. We show qualitatively and quantitatively that this is able to approximate LIME, in a fraction of the time. We investigate our techniques using both image classification, and action recognition networks on large scale datasets (ImageNet and Kinetics-400 respectively). |
Tasks | Image Classification |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.08997v1 |
https://arxiv.org/pdf/1908.08997v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-weighted-superpixels-for |
Repo | |
Framework | |
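The idea of scoring superpixels from gradient-based pixel scores can be sketched as an aggregation step: given a per-pixel relevance map and a segmentation that assigns each pixel to a superpixel, combine the scores inside each segment. The abstract does not state the aggregation rule, so the mean used below is an assumption, and the name `score_superpixels` is hypothetical. Both inputs are assumed to be precomputed (for example by a saliency method and a superpixel algorithm).

```python
from collections import defaultdict

def score_superpixels(pixel_scores, segments):
    """Average per-pixel scores within each superpixel.

    pixel_scores: 2D list of floats (e.g. gradient-based relevance per pixel).
    segments:     2D list of integer superpixel labels with the same shape.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score_row, label_row in zip(pixel_scores, segments):
        for score, label in zip(score_row, label_row):
            totals[label] += score
            counts[label] += 1
    # Assumed aggregation: the mean score of the pixels in each segment.
    return {label: totals[label] / counts[label] for label in totals}

scores = score_superpixels([[0.2, 0.4], [1.0, 0.0]], [[0, 0], [1, 1]])
```

Because this needs only one pass over a precomputed score map, it avoids the repeated perturbed forward passes that LIME-style explanations rely on, which is consistent with the speed advantage the abstract claims.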