October 17, 2019

3328 words 16 mins read

Paper Group ANR 807


Exponential Discriminative Metric Embedding in Deep Learning. Focus On What’s Important: Self-Attention Model for Human Pose Estimation. Direct Network Transfer: Transfer Learning of Sentence Embeddings for Semantic Similarity. Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning. Rediscovering Deep Neural Networks Thr …

Exponential Discriminative Metric Embedding in Deep Learning

Title Exponential Discriminative Metric Embedding in Deep Learning
Authors Bowen Wu, Zhangling Chen, Jun Wang, Huaming Wu
Abstract With the remarkable success achieved by Convolutional Neural Networks (CNNs) in object recognition, deep learning is being widely used in the computer vision community. Deep Metric Learning (DML), integrating deep learning with conventional metric learning, has set new records in many fields, especially in classification tasks. In this paper, we propose a replicable DML method, called the Include and Exclude (IE) loss, which forces the distance between a sample and its designated class center to be separated, by a large margin in the exponential feature projection space, from the mean distance of this sample to the other class centers. With the supervision of the IE loss, we can train CNNs to enhance intra-class compactness and inter-class separability, leading to great improvements on several public datasets ranging from object recognition to face verification. We conduct a comparative study of our algorithm against several typical DML methods on three kinds of networks with different capacities. Extensive experiments on three object recognition datasets and two face recognition datasets demonstrate that the IE loss is consistently superior to other mainstream DML methods and approaches state-of-the-art results.
Tasks Face Recognition, Face Verification, Metric Learning, Object Recognition
Published 2018-03-07
URL http://arxiv.org/abs/1803.02504v1
PDF http://arxiv.org/pdf/1803.02504v1.pdf
PWC https://paperswithcode.com/paper/exponential-discriminative-metric-embedding
Repo
Framework
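
The IE loss as described pairs an attraction term (distance to the sample's own class center) with a repulsion term (mean distance to the other class centers) and enforces a margin between them in an exponential feature space. The exact formulation is in the paper; the snippet below is only a loose NumPy sketch of such a center-margin objective, with the exponential mapping, margin value, and hinge form chosen for illustration.

```python
import numpy as np

def ie_style_loss(features, labels, centers, margin=1.0):
    """Loose sketch of an include/exclude-style center-margin loss.

    For each sample, penalize cases where the distance to its own class
    center is not smaller (by `margin`) than the mean distance to the
    other class centers, measured after an exponential feature mapping.
    This is an illustration, not the paper's exact IE loss.
    """
    mapped = np.exp(features)          # assumed "exponential projection"
    mapped_centers = np.exp(centers)
    losses = []
    for x, y in zip(mapped, labels):
        d = np.linalg.norm(mapped_centers - x, axis=1)  # distances to all centers
        include = d[y]                                  # distance to own center
        exclude = np.delete(d, y).mean()                # mean distance to other centers
        losses.append(max(0.0, include - exclude + margin))  # hinge with margin
    return float(np.mean(losses))

# Toy usage: 4 samples, 2-D features, 3 classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 2))
labels = np.array([0, 1, 2, 0])
centers = rng.normal(size=(3, 2))
print(ie_style_loss(feats, labels, centers))
```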

Focus On What’s Important: Self-Attention Model for Human Pose Estimation

Title Focus On What’s Important: Self-Attention Model for Human Pose Estimation
Authors Guanxiong Sun, Chengqin Ye, Kuanquan Wang
Abstract Human pose estimation is an essential yet challenging task in computer vision. One of the reasons for this difficulty is that there are many redundant regions in the images. In this work, we propose a convolutional network architecture combined with a novel attention model, which we name the attention convolutional neural network (ACNN). ACNN learns to focus on specific regions of different input features. It is a multi-stage architecture: early stages filter out uninformative regions, such as background and redundant body parts, and then pass the important regions, which contain the joints of the human body, to the following stages to obtain a more accurate result. Moreover, it does not require extra manual annotations, and self-learning is one of our intentions. We trained the network in separate stages because the attention learning task and the pose estimation task are not independent. State-of-the-art performance is obtained on the MPII benchmark.
Tasks Pose Estimation
Published 2018-09-22
URL http://arxiv.org/abs/1809.08371v2
PDF http://arxiv.org/pdf/1809.08371v2.pdf
PWC https://paperswithcode.com/paper/focus-on-whats-important-self-attention-model
Repo
Framework
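
The abstract describes attention that suppresses uninformative regions of intermediate feature maps before later stages refine the joint predictions. The published architecture details are in the paper; the snippet below is only a hedged NumPy sketch of the generic idea of gating a feature map with a learned spatial mask.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attend(feature_map, mask_logits):
    """Gate a C x H x W feature map with a per-pixel attention mask.

    `mask_logits` (H x W) would be produced by a small learned branch;
    here it is given directly. Regions with low attention are damped
    before being passed to the next stage. Illustrative only.
    """
    mask = sigmoid(mask_logits)            # values in (0, 1)
    return feature_map * mask[None, :, :]  # broadcast over channels

# Toy usage: 8-channel 4x4 feature map, mask favouring the centre.
rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 4, 4))
logits = -3.0 * np.ones((4, 4))
logits[1:3, 1:3] = 3.0                     # "important" central region
print(attend(fmap, logits).round(2))
```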

Direct Network Transfer: Transfer Learning of Sentence Embeddings for Semantic Similarity

Title Direct Network Transfer: Transfer Learning of Sentence Embeddings for Semantic Similarity
Authors Li Zhang, Steven R. Wilson, Rada Mihalcea
Abstract Sentence encoders, which produce sentence embeddings using neural networks, are typically evaluated by how well they transfer to downstream tasks. This includes semantic similarity, an important task in natural language understanding. Although there has been much work dedicated to building sentence encoders, the accompanying transfer learning techniques have received relatively little attention. In this paper, we propose a transfer learning setting specialized for semantic similarity, which we refer to as direct network transfer. Through experiments on several standard text similarity datasets, we show that applying direct network transfer to existing encoders can lead to state-of-the-art performance. Additionally, we compare several approaches to transfer sentence encoders to semantic similarity tasks, showing that the choice of transfer learning setting greatly affects the performance in many cases, and differs by encoder and dataset.
Tasks Semantic Similarity, Semantic Textual Similarity, Sentence Embeddings, Transfer Learning
Published 2018-04-20
URL http://arxiv.org/abs/1804.07835v2
PDF http://arxiv.org/pdf/1804.07835v2.pdf
PWC https://paperswithcode.com/paper/direct-network-transfer-transfer-learning-of
Repo
Framework
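
The abstract contrasts transfer-learning settings for adapting sentence encoders to similarity tasks. The paper defines direct network transfer precisely; the sketch below shows only a generic setup in that spirit, comparing the cosine similarity of two sentence embeddings against a rescaled gold score with a squared error. The objective, rescaling, and toy encoder are all assumptions for illustration, not the paper's exact setting.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_mse(embed, pairs, gold_scores, low=0.0, high=5.0):
    """Generic similarity fine-tuning objective (illustration only).

    `embed` maps a sentence to a vector; gold scores (e.g. STS labels in
    [0, 5]) are rescaled to [-1, 1] and compared to the cosine similarity
    of the two sentence embeddings with a mean-squared error.
    """
    errors = []
    for (s1, s2), g in zip(pairs, gold_scores):
        pred = cosine(embed(s1), embed(s2))
        target = 2.0 * (g - low) / (high - low) - 1.0
        errors.append((pred - target) ** 2)
    return float(np.mean(errors))

# Toy usage with a bag-of-character "encoder" standing in for a real one.
def toy_embed(sentence):
    vec = np.zeros(26)
    for ch in sentence.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

pairs = [("a cat sits", "a cat is sitting"), ("a cat sits", "stock markets fell")]
print(similarity_mse(toy_embed, pairs, gold_scores=[4.5, 0.5]))
```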

Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning

Title Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning
Authors Xinlei Pan, Eshed Ohn-Bar, Nicholas Rhinehart, Yan Xu, Yilin Shen, Kris M. Kitani
Abstract Humans are able to understand and perform complex tasks by strategically structuring the tasks into incremental steps or subgoals. For a robot attempting to learn to perform a sequential task with critical subgoal states, such states can provide a natural opportunity for interaction with a human expert. This paper analyzes the benefit of incorporating a notion of subgoals into Inverse Reinforcement Learning (IRL) with a Human-In-The-Loop (HITL) framework. The learning process is interactive, with a human expert first providing input in the form of full demonstrations along with some subgoal states. These subgoal states define a set of subtasks for the learning agent to complete in order to achieve the final goal. The learning agent queries for partial demonstrations corresponding to each subtask as needed when the agent struggles with the subtask. The proposed Human Interactive IRL (HI-IRL) framework is evaluated on several discrete path-planning tasks. We demonstrate that subgoal-based interactive structuring of the learning task results in significantly more efficient learning, requiring only a fraction of the demonstration data needed for learning the underlying reward function with the baseline IRL model.
Tasks
Published 2018-06-22
URL http://arxiv.org/abs/1806.08479v1
PDF http://arxiv.org/pdf/1806.08479v1.pdf
PWC https://paperswithcode.com/paper/human-interactive-subgoal-supervision-for
Repo
Framework
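
The abstract outlines an interactive loop: the expert first gives full demonstrations plus subgoal states, and the learner later queries partial demonstrations for the individual subtasks it struggles with. The pseudocode-style sketch below captures only that control flow; every callable name is a hypothetical placeholder, not the paper's API.

```python
def hi_irl_sketch(full_demos, subgoals, learn_reward, train_policy,
                  struggles_on, query_expert, max_rounds=10):
    """Control-flow sketch of human-interactive, subgoal-based IRL.

    All callables are hypothetical placeholders: `learn_reward` fits a
    reward from demonstrations, `train_policy` optimizes against it,
    `struggles_on` flags a subtask the agent cannot complete, and
    `query_expert` returns a partial demonstration for that subtask.
    """
    demos = list(full_demos)                                 # expert's initial full demos
    subtasks = list(zip([None] + subgoals[:-1], subgoals))   # (start, goal) pairs
    policy = None
    for _ in range(max_rounds):
        reward = learn_reward(demos)
        policy = train_policy(reward)
        hard = [t for t in subtasks if struggles_on(policy, t)]
        if not hard:                                         # every subtask solved
            return policy
        for task in hard:                                    # ask only where needed
            demos.append(query_expert(task))
    return policy
```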

Rediscovering Deep Neural Networks Through Finite-State Distributions

Title Rediscovering Deep Neural Networks Through Finite-State Distributions
Authors Amir Emad Marvasti, Ehsan Emad Marvasti, George Atia, Hassan Foroosh
Abstract We propose a new way of thinking about deep neural networks, in which the linear and non-linear components of the network are naturally derived and justified in terms of principles in probability theory. In particular, the models constructed in our framework assign probabilities to uncertain realizations, leading to Kullback-Leibler Divergence (KLD) as the linear layer. In our model construction, we also arrive at a structure similar to ReLU activation supported with Bayes’ theorem. The non-linearities in our framework are normalization layers with ReLU and Sigmoid as element-wise approximations. Additionally, the pooling function is derived as a marginalization of spatial random variables according to the mechanics of the framework. As such, Max Pooling is an approximation to the aforementioned marginalization process. Since our models are composed of finite-state distributions (FSDs) as variables and parameters, exact computation of information-theoretic quantities such as entropy and KLD is possible, thereby providing more objective measures to analyze networks. Unlike existing designs that rely on heuristics, the proposed framework restricts subjective interpretations of CNNs and sheds light on the functionality of neural networks from a completely new perspective.
Tasks
Published 2018-09-26
URL https://arxiv.org/abs/1809.10073v2
PDF https://arxiv.org/pdf/1809.10073v2.pdf
PWC https://paperswithcode.com/paper/rediscovering-deep-neural-networks-in-finite
Repo
Framework
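
The claim that KL divergence yields the linear layer can be made concrete with a standard identity: for a fixed input distribution, the KL divergence to a parameter distribution splits into a (negative) entropy term and a cross-entropy term that is linear in the log-parameters. The LaTeX below states only that well-known decomposition, not the paper's full construction.

```latex
\[
D_{\mathrm{KL}}(p \,\|\, q_\theta)
  = \sum_{i} p_i \log \frac{p_i}{q_{\theta,i}}
  = \underbrace{-H(p)}_{\text{depends only on } p}
  \;-\; \underbrace{\sum_{i} p_i \log q_{\theta,i}}_{\text{linear in } \log q_\theta}
\]
```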

Dynamic Routing on Deep Neural Network for Thoracic Disease Classification and Sensitive Area Localization

Title Dynamic Routing on Deep Neural Network for Thoracic Disease Classification and Sensitive Area Localization
Authors Yan Shen, Mingchen Gao
Abstract We present and evaluate a new deep neural network architecture for automatic thoracic disease detection on chest X-rays. Deep neural networks have shown great success in a plethora of visual recognition tasks, such as image classification and object detection, by stacking multiple layers of convolutional neural networks (CNNs) in a feed-forward manner. However, the performance gain from going deeper has reached a bottleneck as a result of the trade-off between model complexity and discrimination power. We address this problem by utilizing the recently developed routing-by-agreement mechanism in our architecture. A novel characteristic of our network structure is that it extends routing to two types of layer connections: (1) connections between feature maps in dense layers, and (2) connections between primary capsules and prediction capsules in the final classification layer. We show that our networks achieve comparable results with far fewer layers as measured by AUC score. We further show the combined benefit of model interpretability by generating Gradient-weighted Class Activation Mapping (Grad-CAM) for localization. We demonstrate our results on the NIH ChestX-ray14 dataset, which consists of 112,120 images of 30,805 unique patients covering 14 kinds of lung diseases.
Tasks Image Classification, Object Detection, Thoracic Disease Classification
Published 2018-08-17
URL http://arxiv.org/abs/1808.05744v1
PDF http://arxiv.org/pdf/1808.05744v1.pdf
PWC https://paperswithcode.com/paper/dynamic-routing-on-deep-neural-network-for
Repo
Framework
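
The abstract builds on the routing-by-agreement mechanism introduced for capsule networks (Sabour et al., 2017). The NumPy sketch below implements the standard dynamic-routing iteration between one layer of input capsules and one layer of output capsules; how the paper wires this into dense-layer connections and the final classification layer is described in the paper itself.

```python
import numpy as np

def squash(s, eps=1e-9):
    norm2 = np.sum(s * s, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iterations=3):
    """Standard routing-by-agreement between two capsule layers.

    u_hat: predictions of shape (num_in, num_out, dim_out), i.e. what each
    input capsule i predicts for each output capsule j.
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                            # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # softmax over output capsules
        s = np.einsum("ij,ijd->jd", c, u_hat)                  # weighted sum per output capsule
        v = squash(s)                                          # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)              # agreement update
    return v

# Toy usage: 6 input capsules routing to 4 output capsules of dimension 8.
rng = np.random.default_rng(0)
print(dynamic_routing(rng.normal(size=(6, 4, 8))).shape)  # (4, 8)
```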

An Evolutionary Hierarchical Interval Type-2 Fuzzy Knowledge Representation System (EHIT2FKRS) for Travel Route Assignment

Title An Evolutionary Hierarchical Interval Type-2 Fuzzy Knowledge Representation System (EHIT2FKRS) for Travel Route Assignment
Authors Mariam Zouari, Nesrine Baklouti, Javier Sanchez Medina, Mounir Ben Ayed, Adel M. Alimi
Abstract Urban traffic networks are characterized by highly dynamic traffic flow and increased travel times, including waiting times, which makes road traffic management more complex. This paper presents an advanced traffic management system based on a Hierarchical Interval Type-2 Fuzzy Logic model optimized by Particle Swarm Optimization (PSO). The aim of the system is to perform dynamic route assignment in order to relieve traffic congestion and limit the effects of unexpected fluctuations on traffic flow. The suggested system is implemented and simulated using SUMO, a well-known microscopic traffic simulator. For the present study, we tested four large and heterogeneous metropolitan areas located in the cities of Sfax, Luxembourg, Bologna and Cologne. The experimental results demonstrate the effectiveness of learning the Hierarchical Interval Type-2 Fuzzy Logic system with real-time PSO to achieve multiobjective optimality with respect to two criteria: the number of vehicles that reach their destination and the average travel time. The obtained results are encouraging and confirm the efficiency of the proposed system.
Tasks
Published 2018-12-05
URL http://arxiv.org/abs/1812.01893v1
PDF http://arxiv.org/pdf/1812.01893v1.pdf
PWC https://paperswithcode.com/paper/an-evolutionary-hierarchical-interval-type-2
Repo
Framework
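
The system tunes its hierarchical interval type-2 fuzzy model with particle swarm optimization. The paper's encoding of the fuzzy parameters is specific to its architecture; the snippet below is only the textbook PSO velocity-and-position update on a generic objective, shown to make the optimization step concrete.

```python
import numpy as np

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook particle swarm optimization (minimization), for illustration."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))   # positions
    v = np.zeros_like(x)                                   # velocities
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())

# Toy usage: minimize a shifted sphere function standing in for the traffic objective.
print(pso(lambda p: np.sum((p - 0.3) ** 2), dim=4))
```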

Deep Adaptive Proposal Network for Object Detection in Optical Remote Sensing Images

Title Deep Adaptive Proposal Network for Object Detection in Optical Remote Sensing Images
Authors Lin Cheng, Xu Liu, Lingling Li, Licheng Jiao, Xu Tang
Abstract Object detection is a fundamental and challenging problem in aerial and satellite image analysis. Recently, the two-stage detector Faster R-CNN has been demonstrated to be a promising tool for object detection in optical remote sensing images; however, objects in such images are distributed both sparsely and densely, which adds complexity. It is unreasonable to treat all images with the same region proposal strategy, and this treatment limits the performance of two-stage detectors. In this paper, we propose a novel and effective approach, named the deep adaptive proposal network (DAPNet), which addresses this characteristic of objects by learning a new category prior network (CPN) on top of the existing Faster R-CNN architecture. Moreover, the candidate regions produced by the DAPNet model differ from those of the traditional region proposal network (RPN): DAPNet predicts the detailed category of each candidate region. These candidate regions are then combined with the object counts generated by the category prior network to obtain a suitable number of candidate boxes for each image, so that the candidate boxes satisfy detection tasks in both sparse and dense scenes. The performance of the proposed framework has been evaluated on the challenging NWPU VHR-10 data set. Experimental results demonstrate the superiority of the proposed framework over the state-of-the-art.
Tasks Object Detection
Published 2018-07-19
URL http://arxiv.org/abs/1807.07327v1
PDF http://arxiv.org/pdf/1807.07327v1.pdf
PWC https://paperswithcode.com/paper/deep-adaptive-proposal-network-for-object
Repo
Framework
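
The key idea in the abstract is that a category prior network estimates roughly how many objects of each category an image contains, and the number of candidate boxes kept per image is adapted to those counts. The snippet below sketches only that selection step (keep a count-dependent number of the highest-scoring proposals per category); the multiplier and the selection rule are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def adaptive_proposal_selection(scores, categories, predicted_counts, boxes_per_object=4):
    """Keep a count-dependent number of top-scoring proposals per category.

    scores: (N,) proposal confidences; categories: (N,) predicted category ids;
    predicted_counts: dict {category_id: estimated number of objects}. The
    `boxes_per_object` multiplier is an arbitrary choice for this sketch.
    """
    keep = []
    for cat, count in predicted_counts.items():
        idx = np.where(categories == cat)[0]
        budget = min(len(idx), max(1, int(count) * boxes_per_object))
        top = idx[np.argsort(scores[idx])[::-1][:budget]]   # highest scores first
        keep.extend(top.tolist())
    return sorted(keep)

# Toy usage: 10 proposals over 2 categories, with one predicted plane and two ships.
rng = np.random.default_rng(0)
scores = rng.random(10)
categories = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
print(adaptive_proposal_selection(scores, categories, {0: 1, 1: 2}, boxes_per_object=2))
```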

A Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation

Title A Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation
Authors Rob Romijnders, Panagiotis Meletis, Gijs Dubbelman
Abstract We propose a normalization layer for unsupervised domain adaptation in semantic scene segmentation. Normalization layers are known to improve convergence and generalization and are part of many state-of-the-art fully-convolutional neural networks. We show that conventional normalization layers worsen the performance of current Unsupervised Adversarial Domain Adaptation (UADA), which is a method to improve network performance on unlabeled datasets and the focus of our research. Therefore, we propose a novel Domain Agnostic Normalization layer and thereby unlock the benefits of normalization layers for unsupervised adversarial domain adaptation. In our evaluation, we adapt from the synthetic GTA5 data set to the real Cityscapes data set, a common benchmark experiment, and surpass the state-of-the-art. As our normalization layer is domain agnostic at test time, we furthermore demonstrate that UADA using Domain Agnostic Normalization improves performance on unseen domains, specifically on Apolloscape and Mapillary.
Tasks Domain Adaptation, Scene Segmentation
Published 2018-09-14
URL http://arxiv.org/abs/1809.05298v1
PDF http://arxiv.org/pdf/1809.05298v1.pdf
PWC https://paperswithcode.com/paper/a-domain-agnostic-normalization-layer-for
Repo
Framework
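
The paper defines its Domain Agnostic Normalization layer precisely; one generic way a normalization layer can avoid carrying source-domain statistics to the target domain is to compute its statistics from the current input rather than from dataset-level running averages (instance-normalization style). The sketch below shows only that generic variant, as an assumption about the flavour of the idea rather than the paper's exact layer.

```python
import numpy as np

def per_sample_normalize(x, gamma, beta, eps=1e-5):
    """Normalize each sample with its own spatial statistics.

    x: (N, C, H, W). Because the mean/variance come from the input itself
    rather than from dataset-level running statistics, the normalization
    does not encode source-domain statistics. Generic instance-norm-style
    illustration, not the paper's exact layer.
    """
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma[None, :, None, None] * x_hat + beta[None, :, None, None]

# Toy usage on a 2-image, 3-channel batch.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 8, 8))
print(per_sample_normalize(x, gamma=np.ones(3), beta=np.zeros(3)).shape)
```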

A Fast and Greedy Subset-of-Data (SoD) Scheme for Sparsification in Gaussian processes

Title A Fast and Greedy Subset-of-Data (SoD) Scheme for Sparsification in Gaussian processes
Authors Vidhi Lalchand, A. C. Faul
Abstract In their standard form, Gaussian processes (GPs) provide a powerful non-parametric framework for regression and classification tasks. Their one limiting property is their $\mathcal{O}(N^{3})$ scaling, where $N$ is the number of training data points. In this paper we present a framework for GP training with sequential selection of training data points using an intuitive selection metric. The greedy forward selection strategy is devised to target two factors: regions of high predictive uncertainty and underfitting. Under this technique the complexity of GP training is reduced to $\mathcal{O}(M^{3})$, where $M \ll N$, if $M$ data points (out of $N$) are eventually selected. The sequential nature of the algorithm circumvents the need to invert the covariance matrix of dimension $N \times N$ and enables the use of favourable matrix-inverse update identities. We outline the algorithm and the sequential updates to the posterior mean and variance. We demonstrate our method on selected one-dimensional functions and show that the loss in accuracy due to using a subset of data points is marginal compared to the computational gains.
Tasks Gaussian Processes
Published 2018-11-17
URL https://arxiv.org/abs/1811.07199v2
PDF https://arxiv.org/pdf/1811.07199v2.pdf
PWC https://paperswithcode.com/paper/a-greedy-approximation-scheme-for-sparse
Repo
Framework
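
The selection metric in the abstract targets regions of high predictive uncertainty (and underfitting); the exact metric and the incremental matrix-inverse updates are in the paper. As a minimal sketch under those caveats, the snippet below greedily grows a subset by repeatedly adding the point with the largest GP predictive variance under an RBF kernel, recomputing the posterior naively at each step rather than with rank-one updates.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def greedy_sod(x, y, m, noise=1e-2):
    """Greedy subset-of-data selection by maximum predictive variance (1-D inputs).

    Naive recomputation each step; the paper uses incremental matrix-inverse
    update identities and a richer selection metric that also targets underfit.
    """
    selected = [int(np.argmax(np.abs(y)))]          # arbitrary seed point
    while len(selected) < m:
        xs = x[selected]
        K = rbf(xs, xs) + noise * np.eye(len(selected))
        K_inv = np.linalg.inv(K)
        k_star = rbf(x, xs)                          # (N, M) cross-covariances
        var = 1.0 - np.einsum("nm,mk,nk->n", k_star, K_inv, k_star)
        var[selected] = -np.inf                      # never reselect a point
        selected.append(int(np.argmax(var)))
    return selected

# Toy usage: pick 5 of 50 points from a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0, 6, 50)
y = np.sin(x) + 0.1 * rng.normal(size=50)
print(greedy_sod(x, y, m=5))
```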

Left Ventricle Segmentation and Volume Estimation on Cardiac MRI using Deep Learning

Title Left Ventricle Segmentation and Volume Estimation on Cardiac MRI using Deep Learning
Authors Ehab Abdelmaguid, Jolene Huang, Sanjay Kenchareddy, Disha Singla, Laura Wilke, Mai H. Nguyen, Ilkay Altintas
Abstract In the United States, heart disease is the leading cause of death for both men and women, accounting for 610,000 deaths each year [1]. Physicians use Magnetic Resonance Imaging (MRI) scans to take images of the heart in order to non-invasively estimate its structural and functional parameters for cardiovascular diagnosis and disease management. The end-systolic volume (ESV) and end-diastolic volume (EDV) of the left ventricle (LV), and the ejection fraction (EF) are indicators of heart disease. These measures can be derived from the segmented contours of the LV; thus, consistent and accurate segmentation of the LV from MRI images is critical to the accuracy of the ESV, EDV, and EF, and to non-invasive cardiac disease detection. In this work, various image preprocessing techniques, model configurations using the U-Net deep learning architecture, postprocessing methods, and approaches for volume estimation are investigated. An end-to-end analytics pipeline with multiple stages is provided for automated LV segmentation and volume estimation. First, image data are reformatted and processed from DICOM and NIfTI formats to raw images in array format. Second, raw images are processed with multiple image preprocessing methods and cropped to include only the Region of Interest (ROI). Third, preprocessed images are segmented using U-Net models. Finally, postprocessing of the segmented images to remove extra contours, along with intelligent slice and frame selection, is applied, followed by calculation of the ESV, EDV, and EF. This analytics pipeline is implemented and runs on a distributed computing environment with a GPU cluster at the San Diego Supercomputer Center at UCSD.
Tasks
Published 2018-09-14
URL http://arxiv.org/abs/1809.06247v2
PDF http://arxiv.org/pdf/1809.06247v2.pdf
PWC https://paperswithcode.com/paper/left-ventricle-segmentation-and-volume
Repo
Framework
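
The final stage of the pipeline derives clinical measures from the segmented volumes. The relation between the reported quantities is standard: the ejection fraction is the fraction of end-diastolic volume expelled per beat. A minimal sketch:

```python
def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction (%) from end-diastolic and end-systolic LV volumes."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Example: an EDV of 120 mL and an ESV of 50 mL give an EF of about 58%.
print(round(ejection_fraction(120.0, 50.0), 1))
```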

Matching Disparate Image Pairs Using Shape-Aware ConvNets

Title Matching Disparate Image Pairs Using Shape-Aware ConvNets
Authors Shefali Srivastava, Abhimanyu Chopra, Arun CS Kumar, Suchendra M. Bhandarkar, Deepak Sharma
Abstract An end-to-end trainable ConvNet architecture, which learns to harness the power of shape representation for matching disparate image pairs, is proposed. Disparate image pairs are deemed those that exhibit strong affine variations in scale, viewpoint and projection parameters, accompanied by partial or complete occlusion of objects and extreme variations in ambient illumination. Under these challenging conditions, neither local nor global feature-based image matching methods, when used in isolation, have been observed to be effective. The proposed correspondence determination scheme for matching disparate images exploits high-level shape cues that are derived from low-level local feature descriptors, thus combining the best of both worlds. A graph-based representation for the disparate image pair is generated by constructing an affinity matrix that embeds the distances between feature points in the two images, thus modeling the correspondence determination problem as one of graph matching. The eigenspectrum of the affinity matrix, i.e., the learned global shape representation, is then used to regress the transformation or homography that defines the correspondence between the source image and the target image. The proposed scheme is shown to yield state-of-the-art results for both coarse-level shape matching and fine point-wise correspondence determination.
Tasks Graph Matching, Matching Disparate Images
Published 2018-11-24
URL http://arxiv.org/abs/1811.09889v1
PDF http://arxiv.org/pdf/1811.09889v1.pdf
PWC https://paperswithcode.com/paper/matching-disparate-image-pairs-using-shape
Repo
Framework
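
The abstract describes embedding pairwise feature-point distances into an affinity matrix and using its eigenspectrum as a global shape representation. The NumPy sketch below builds a Gaussian affinity matrix from keypoint coordinates and extracts its leading eigenvalues; the kernel width and number of eigenvalues are arbitrary choices for illustration, and the learned regression to a homography is not shown.

```python
import numpy as np

def shape_spectrum(points, sigma=1.0, k=5):
    """Leading eigenvalues of a Gaussian affinity matrix over keypoints.

    points: (N, 2) feature-point coordinates from one image. The sorted
    top-k eigenvalues act as a coarse, permutation-insensitive shape
    descriptor (illustration of the general idea only).
    """
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    affinity = np.exp(-d2 / (2.0 * sigma**2))
    eigvals = np.linalg.eigvalsh(affinity)          # symmetric, so eigvalsh
    return np.sort(eigvals)[::-1][:k]

# Toy usage: the spectrum is unchanged when the point set is rotated.
rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 2))
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
print(np.allclose(shape_spectrum(pts), shape_spectrum(pts @ rot.T)))  # True
```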

A Systems Approach to Achieving the Benefits of Artificial Intelligence in UK Defence

Title A Systems Approach to Achieving the Benefits of Artificial Intelligence in UK Defence
Authors Gavin Pearson, Phil Jolley, Geraint Evans
Abstract The ability to exploit the opportunities offered by AI within UK Defence calls for an understanding of the systemic issues required to achieve an effective operational capability. This paper provides the authors’ views of issues which currently block UK Defence from fully benefitting from AI technology. These are situated within a reference model for the AI Value Train, enabling the community to address the exploitation of such data- and software-intensive systems in a systematic, end-to-end manner. The paper sets out the conditions for success, including: researching future solutions to known problems and clearly defined use cases; addressing achievable use cases to show benefit; enhancing the availability of Defence-relevant data; enhancing Defence ‘know how’ in AI; operating software-intensive supply chain eco-systems at the required breadth and pace; governance; and the integration of software and platform supply chains and operating models.
Tasks
Published 2018-09-28
URL http://arxiv.org/abs/1809.11089v1
PDF http://arxiv.org/pdf/1809.11089v1.pdf
PWC https://paperswithcode.com/paper/a-systems-approach-to-achieving-the-benefits
Repo
Framework

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models

Title Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models
Authors Yingxiang Yang, Adams Wei Yu, Zhaoran Wang, Tuo Zhao
Abstract We propose a nonparametric method for detecting nonlinear causal relationships within a set of multidimensional discrete time series, by using sparse additive models (SpAMs). We show that, when the input to the SpAM is a $\beta$-mixing time series, the model can be fitted by first approximating each unknown function with a linear combination of a set of B-spline bases, and then solving a group-lasso-type optimization problem with nonconvex regularization. Theoretically, we characterize the oracle statistical properties of the proposed sparse estimator in function estimation and model selection. Numerically, we propose an efficient pathwise iterative shrinkage thresholding algorithm (PISTA), which tames the nonconvexity and guarantees linear convergence towards the desired sparse estimator with high probability.
Tasks Model Selection, Time Series
Published 2018-03-11
URL http://arxiv.org/abs/1803.03919v2
PDF http://arxiv.org/pdf/1803.03919v2.pdf
PWC https://paperswithcode.com/paper/detecting-nonlinear-causality-in-multivariate
Repo
Framework
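
The model class in the abstract can be written compactly: each target series is regressed on additive, univariate functions of the lagged series, each function is expanded in B-spline bases, and a group penalty is applied per lagged series so that a zeroed group indicates no (nonlinear Granger-type) causal effect. The LaTeX below is a generic statement of that setup (one lag, one target series, our own notation), not a transcription of the paper's exact formulation.

```latex
\[
x_{t} \;=\; \sum_{j=1}^{d} f_{j}\!\left(x^{(j)}_{t-1}\right) + \varepsilon_{t},
\qquad
f_{j}(u) \;\approx\; \sum_{k=1}^{K} \beta_{jk}\, B_{k}(u),
\]
\[
\widehat{\beta} \;=\; \arg\min_{\beta}\;
\frac{1}{T}\sum_{t}\Big(x_{t} - \sum_{j,k}\beta_{jk}\, B_{k}\!\big(x^{(j)}_{t-1}\big)\Big)^{2}
\;+\; \sum_{j=1}^{d} \rho_{\lambda}\!\left(\lVert \beta_{j\cdot} \rVert_{2}\right),
\]
where $\rho_{\lambda}$ is a (possibly nonconvex) group penalty; series $j$ is declared non-causal for the target when the whole group $\beta_{j\cdot}$ is shrunk to zero.
```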

Synaptic Cleft Segmentation in Non-Isotropic Volume Electron Microscopy of the Complete Drosophila Brain

Title Synaptic Cleft Segmentation in Non-Isotropic Volume Electron Microscopy of the Complete Drosophila Brain
Authors Larissa Heinrich, Jan Funke, Constantin Pape, Juan Nunez-Iglesias, Stephan Saalfeld
Abstract Neural circuit reconstruction at single synapse resolution is increasingly recognized as crucially important to decipher the function of biological nervous systems. Volume electron microscopy in serial transmission or scanning mode has been demonstrated to provide the necessary resolution to segment or trace all neurites and to annotate all synaptic connections. Automatic annotation of synaptic connections has been done successfully in near isotropic electron microscopy of vertebrate model organisms. Results on non-isotropic data in insect models, however, are not yet on par with human annotation. We designed a new 3D-U-Net architecture to optimally represent isotropic fields of view in non-isotropic data. We used regression on a signed distance transform of manually annotated synaptic clefts of the CREMI challenge dataset to train this model and observed significant improvement over the state of the art. We developed open source software for optimized parallel prediction on very large volumetric datasets and applied our model to predict synaptic clefts in a 50 tera-voxels dataset of the complete Drosophila brain. Our model generalizes well to areas far away from where training data was available.
Tasks
Published 2018-05-07
URL http://arxiv.org/abs/1805.02718v1
PDF http://arxiv.org/pdf/1805.02718v1.pdf
PWC https://paperswithcode.com/paper/synaptic-cleft-segmentation-in-non-isotropic
Repo
Framework
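
The training target described in the abstract is a signed distance transform of the annotated synaptic clefts: distances of one sign outside the cleft mask and the other sign inside, typically truncated so the regression focuses on the region near boundaries. The snippet below shows one common way to compute such a target with SciPy; the sign convention and the clipping threshold are assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_target(mask, clip=20.0):
    """Signed Euclidean distance transform of a binary mask (2-D or 3-D).

    Positive outside the labelled structure, negative inside, clipped to
    [-clip, clip] so the regression focuses on values near boundaries.
    Sign convention and clipping value are illustrative choices.
    """
    mask = mask.astype(bool)
    outside = distance_transform_edt(~mask)     # distance to the mask, outside it
    inside = distance_transform_edt(mask)       # distance to background, inside it
    return np.clip(outside - inside, -clip, clip)

# Toy usage: a small cube standing in for a cleft inside a 32^3 volume.
vol = np.zeros((32, 32, 32), dtype=bool)
vol[12:18, 12:18, 12:18] = True
sdt = signed_distance_target(vol)
print(sdt.min(), sdt.max())
```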