Paper Group ANR 288
UBSegNet: Unified Biometric Region of Interest Segmentation Network
Title | UBSegNet: Unified Biometric Region of Interest Segmentation Network |
Authors | Ranjeet Ranjan Jha, Daksh Thapar, Shreyas Malakarjun Patil, Aditya Nigam |
Abstract | Digital human identity management can now be seen as a social necessity, as it is required in almost every public sector such as financial inclusion, security, banking and social networking. Hence, in a rapidly evolving world with many adversarial entities, relying on a single biometric trait is overly optimistic. In this paper, we propose a novel end-to-end Unified Biometric ROI Segmentation Network (UBSegNet) for extracting the region of interest from five different biometric traits, viz. face, iris, palm, knuckle and 4-slap fingerprint. The architecture of the proposed UBSegNet consists of two stages: (i) trait classification and (ii) trait localization. For these stages, we use a state-of-the-art region-based convolutional neural network (R-CNN), comprising three major parts: convolutional layers, a region proposal network (RPN), and classification and regression heads. The model has been evaluated over several large, publicly available biometric databases. To the best of our knowledge, this is the first unified architecture proposed for segmenting multiple biometric traits. It has been tested on around 5000 * 5 = 25,000 images (5000 images per trait) and produces very good results. Our work on unified biometric segmentation opens up vast opportunities in the field of authentication systems based on multiple biometric traits. |
Tasks | |
Published | 2017-09-26 |
URL | http://arxiv.org/abs/1709.08924v1 |
http://arxiv.org/pdf/1709.08924v1.pdf | |
PWC | https://paperswithcode.com/paper/ubsegnet-unified-biometric-region-of-interest |
Repo | |
Framework | |
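The entry above describes a two-stage, R-CNN-style detector (an RPN plus classification and regression heads) over five biometric trait classes. The paper's exact backbone and training details are not given here, so the following is only a minimal sketch of how such a detector could be set up with torchvision's off-the-shelf Faster R-CNN; the six-class head (five traits plus background) is an assumption based on the abstract.

```python
# Minimal sketch (not the authors' code): a Faster R-CNN detector with a
# 6-way head -- background + {face, iris, palm, knuckle, 4-slap fingerprint}.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=6)

# Training then follows the standard torchvision detection loop:
# losses = model(images, targets); sum(losses.values()).backward()
```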
Convolutional Neural Networks: Ensemble Modeling, Fine-Tuning and Unsupervised Semantic Localization for Intraoperative CLE Images
Title | Convolutional Neural Networks: Ensemble Modeling, Fine-Tuning and Unsupervised Semantic Localization for Intraoperative CLE Images |
Authors | Mohammadhassan Izadyyazdanabadi, Evgenii Belykh, Michael Mooney, Nikolay Martirosyan, Jennifer Eschbacher, Peter Nakaji, Mark C. Preul, Yezhou Yang |
Abstract | Confocal laser endomicroscopy (CLE) is an advanced optical fluorescence technology undergoing assessment for applications in brain tumor surgery. Despite its promising potential, interpreting the unfamiliar gray-tone images of fluorescent stains can be difficult. Many CLE images can be distorted by motion, extremely low or high fluorescence signal, or obscured by red blood cell accumulation, and these can be interpreted as nondiagnostic. However, a single clear CLE image might suffice for intraoperative diagnosis of the tumor. Manual examination of thousands of nondiagnostic images during surgery would be impractical, which creates an opportunity for a model to select diagnostic images for the pathologist's or surgeon's review. In this study, we sought to develop a deep learning model to automatically detect the diagnostic images using a manually annotated dataset, and we employed a patient-based nested cross-validation approach to explore the generalizability of the model. We explored various training regimes: deep training, shallow fine-tuning, and deep fine-tuning. Further, we investigated the effect of ensemble modeling by combining the top-5 single models crafted in the development phase. We localized histological features from diagnostic CLE images by visualizing shallow and deep neural activations. Our inter-rater experiment results confirmed that our ensemble of deeply fine-tuned models achieved higher agreement with the ground truth than the other observers. With the speed and precision of the proposed method (110 images/second; 85% on the gold-standard test subset), it has the potential to be integrated into the operative workflow in brain tumor surgery. |
Tasks | |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.03028v2 |
http://arxiv.org/pdf/1709.03028v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-ensemble |
Repo | |
Framework | |
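As a rough illustration of the ensemble step described above (averaging the outputs of the top-5 fine-tuned single models), here is a minimal PyTorch sketch; the backbone architecture, checkpoint paths and two-class (diagnostic vs. nondiagnostic) setup are assumptions, not details taken from the paper.

```python
# Minimal sketch (assumptions noted above): average the softmax outputs of
# several independently fine-tuned CNNs to score CLE frames as diagnostic.
import torch
import torch.nn.functional as F
import torchvision

def load_finetuned(path):
    """Hypothetical helper: a 2-class CNN restored from a checkpoint."""
    net = torchvision.models.alexnet(num_classes=2)
    net.load_state_dict(torch.load(path, map_location="cpu"))
    return net.eval()

@torch.no_grad()
def ensemble_predict(models, images):
    # images: (N, 3, H, W) batch of CLE frames
    probs = torch.stack([F.softmax(m(images), dim=1) for m in models])
    return probs.mean(dim=0)  # (N, 2) averaged class probabilities

# models = [load_finetuned(p) for p in top5_checkpoint_paths]
# diagnostic_prob = ensemble_predict(models, batch)[:, 1]
```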
Effective Spoken Language Labeling with Deep Recurrent Neural Networks
Title | Effective Spoken Language Labeling with Deep Recurrent Neural Networks |
Authors | Marco Dinarelli, Yoann Dupont, Isabelle Tellier |
Abstract | Understanding spoken language is a highly complex problem, which can be decomposed into several simpler tasks. In this paper, we focus on Spoken Language Understanding (SLU), the module of spoken dialog systems responsible for extracting a semantic interpretation from the user utterance. The task is treated as a labeling problem. In the past, SLU has been performed with a wide variety of probabilistic models. The rise of neural networks in the last few years has opened new, interesting research directions in this domain. Recurrent Neural Networks (RNNs) in particular are able not only to represent several pieces of information as embeddings but also, thanks to their recurrent architecture, to encode relatively long contexts as embeddings. Such long contexts are in general out of reach for models previously used for SLU. In this paper we propose novel RNN architectures for SLU which outperform previous ones. Starting from a published idea as a base block, we design new deep RNNs achieving state-of-the-art results on two widely used SLU corpora: ATIS (Air Travel Information System), in English, and MEDIA (hotel information and reservation in France), in French. |
Tasks | Spoken Language Understanding |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1706.06896v1 |
http://arxiv.org/pdf/1706.06896v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-spoken-language-labeling-with-deep |
Repo | |
Framework | |
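The abstract treats SLU as a sequence-labeling problem solved with deep RNNs over word embeddings. The paper's specific architecture (built from a published base block) is not reproduced here; the sketch below is only a generic bidirectional-LSTM slot tagger in PyTorch with assumed vocabulary and label sizes.

```python
# Minimal sketch of an RNN slot tagger (generic BiLSTM, not the paper's model).
import torch
import torch.nn as nn

class SlotTagger(nn.Module):
    def __init__(self, vocab_size, num_labels, emb_dim=100, hidden=128, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, num_layers=layers,
                           bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, tokens):           # tokens: (batch, seq_len) word ids
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)               # (batch, seq_len, num_labels) logits

# Example with made-up sizes for ATIS-style slot labels:
tagger = SlotTagger(vocab_size=10000, num_labels=120)
logits = tagger(torch.randint(0, 10000, (4, 20)))
```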
Efficient and Effective Single-Document Summarizations and A Word-Embedding Measurement of Quality
Title | Efficient and Effective Single-Document Summarizations and A Word-Embedding Measurement of Quality |
Authors | Liqun Shao, Hao Zhang, Ming Jia, Jie Wang |
Abstract | Our task is to generate an effective summary for a given document with specific real-time requirements. We use the softplus function to enhance keyword rankings to favor important sentences, based on which we present a number of summarization algorithms using various keyword extraction and topic clustering methods. We show that our algorithms meet the real-time requirements and yield the best ROUGE recall scores on DUC-02 over all previously known algorithms. To evaluate the quality of summaries without human-generated benchmarks, we define a measure called WESM based on word embedding using Word Mover’s Distance. We show that the orderings of the ROUGE and WESM scores of our algorithms are highly comparable, suggesting that WESM may serve as a viable alternative for measuring the quality of a summary. |
Tasks | Keyword Extraction |
Published | 2017-10-01 |
URL | http://arxiv.org/abs/1710.00284v1 |
http://arxiv.org/pdf/1710.00284v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-effective-single-document |
Repo | |
Framework | |
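The abstract's core scoring idea is to pass keyword weights through a softplus function and rank sentences by the enhanced weights of the keywords they contain. The exact keyword extractor and weighting scheme used in the paper are not given here, so the snippet below is only a schematic NumPy version with hypothetical keyword scores.

```python
# Schematic sketch of softplus-enhanced sentence scoring (not the paper's code).
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def score_sentence(sentence, keyword_weights):
    """Sum softplus-boosted weights of the keywords appearing in a sentence."""
    words = [w.strip(".,;:").lower() for w in sentence.split()]
    return sum(softplus(keyword_weights[w]) for w in words if w in keyword_weights)

# Hypothetical keyword weights from any extractor (e.g. TF-IDF or TextRank):
weights = {"rouge": 2.1, "summary": 1.7, "embedding": 1.3}
sentences = ["We report ROUGE scores for each summary.",
             "The weather was pleasant."]
ranked = sorted(sentences, key=lambda s: score_sentence(s, weights), reverse=True)
```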
Deep Blind Image Inpainting
Title | Deep Blind Image Inpainting |
Authors | Yang Liu, Jinshan Pan, Zhixun Su |
Abstract | Image inpainting is a challenging problem as it needs to fill in the information of the corrupted regions. Most existing inpainting algorithms assume that the positions of the corrupted regions are known. Different from the existing methods that usually make some assumptions on the corrupted regions, we present an efficient blind image inpainting algorithm to directly restore a clear image from a corrupted input. Our algorithm is motivated by residual learning, which aims to learn the missing information in corrupted regions. However, directly using existing residual learning algorithms for image restoration does not solve this problem well, as little information is available in the corrupted regions. To solve this problem, we introduce an encoder-decoder architecture to capture more useful information and develop a robust loss function to deal with outliers. Our algorithm can predict the missing information in the corrupted regions, thus facilitating clear image restoration. Both qualitative and quantitative experiments demonstrate that our algorithm can deal with corrupted regions of arbitrary shapes and performs favorably against state-of-the-art methods. |
Tasks | Image Inpainting, Image Restoration |
Published | 2017-12-25 |
URL | http://arxiv.org/abs/1712.09078v1 |
http://arxiv.org/pdf/1712.09078v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-blind-image-inpainting |
Repo | |
Framework | |
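The method above pairs an encoder-decoder network with a robust loss so that a clear image is restored directly from a corrupted input, without a known mask. The sketch below is only an illustrative, much smaller encoder-decoder with an L1 loss standing in for the paper's robust loss; the layer sizes are arbitrary.

```python
# Illustrative sketch of a blind-inpainting setup (not the paper's network).
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, corrupted):
        return self.decoder(self.encoder(corrupted))

net = TinyEncoderDecoder()
robust_loss = nn.L1Loss()           # stand-in for the paper's outlier-robust loss
corrupted = torch.rand(2, 3, 64, 64)
clean = torch.rand(2, 3, 64, 64)
loss = robust_loss(net(corrupted), clean)
loss.backward()
```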
Multiple Instance Hybrid Estimator for Hyperspectral Target Characterization and Sub-pixel Target Detection
Title | Multiple Instance Hybrid Estimator for Hyperspectral Target Characterization and Sub-pixel Target Detection |
Authors | Changzhe Jiao, Chao Chen, Ronald G. McGarvey, Stephanie Bohlman, Licheng Jiao, Alina Zare |
Abstract | The Multiple Instance Hybrid Estimator for discriminative target characterization from imprecisely labeled hyperspectral data is presented. In many hyperspectral target detection problems, acquiring accurately labeled training data is difficult. Furthermore, each pixel containing a target is likely to be a mixture of both target and non-target signatures (i.e., sub-pixel targets), making the extraction of a pure prototype signature for the target class extremely difficult. The proposed approach addresses these problems by introducing a data mixing model and optimizing the response of the hybrid sub-pixel detector within a multiple instance learning framework. The proposed approach iterates between estimating a set of discriminative target and non-target signatures and solving a sparse unmixing problem. After learning target signatures, a signature-based detector can then be applied to test data. Both simulated and real hyperspectral target detection experiments show that the proposed algorithm is effective at learning discriminative target signatures and achieves superior performance over state-of-the-art comparison algorithms. |
Tasks | Multiple Instance Learning |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11599v2 |
http://arxiv.org/pdf/1710.11599v2.pdf | |
PWC | https://paperswithcode.com/paper/multiple-instance-hybrid-estimator-for-1 |
Repo | |
Framework | |
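Once target signatures have been learned, the abstract notes that a signature-based detector is applied to test pixels. The paper's hybrid sub-pixel detector is not reproduced here; as a stand-in, the sketch below computes the classical adaptive cosine estimator (ACE), a standard sub-pixel detector, for a learned signature.

```python
# Sketch of a classical sub-pixel detector (ACE) applied with a learned target
# signature; this is a generic stand-in, not the paper's hybrid estimator.
import numpy as np

def ace_detector(pixels, signature, bg_mean, bg_cov):
    """pixels: (N, B) spectra, signature: (B,), background stats from training."""
    cov_inv = np.linalg.inv(bg_cov + 1e-6 * np.eye(bg_cov.shape[0]))
    x = pixels - bg_mean
    s = signature - bg_mean
    num = (x @ cov_inv @ s) ** 2
    den = (s @ cov_inv @ s) * np.einsum("ij,jk,ik->i", x, cov_inv, x)
    return num / den

# Hypothetical usage with B spectral bands:
B = 50
pixels = np.random.rand(1000, B)
target_sig = np.random.rand(B)
scores = ace_detector(pixels, target_sig, pixels.mean(0), np.cov(pixels.T))
```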
Translating Phrases in Neural Machine Translation
Title | Translating Phrases in Neural Machine Translation |
Authors | Xing Wang, Zhaopeng Tu, Deyi Xiong, Min Zhang |
Abstract | Phrases play an important role in natural language understanding and machine translation (Sag et al., 2002; Villavicencio et al., 2005). However, it is difficult to integrate them into current neural machine translation (NMT), which reads and generates sentences word by word. In this work, we propose a method to translate phrases in NMT by integrating a phrase memory storing target phrases from a phrase-based statistical machine translation (SMT) system into the encoder-decoder architecture of NMT. At each decoding step, the phrase memory is first re-written by the SMT model, which dynamically generates relevant target phrases with contextual information provided by the NMT model. Then the proposed model reads the phrase memory to estimate probabilities for all phrases in the memory. If phrase generation is selected, the NMT decoder picks an appropriate phrase from the memory to perform phrase translation and updates its decoding state by consuming the words in the selected phrase. Otherwise, the NMT decoder generates a word from the vocabulary as the general NMT decoder does. Experimental results on Chinese-to-English translation show that the proposed model achieves significant improvements over the baseline on various test sets. |
Tasks | Machine Translation |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.01980v1 |
http://arxiv.org/pdf/1708.01980v1.pdf | |
PWC | https://paperswithcode.com/paper/translating-phrases-in-neural-machine |
Repo | |
Framework | |
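At each decoding step the model above either generates a word from the vocabulary or selects a phrase from the SMT-written phrase memory. As a very rough illustration of that choice, the sketch below scores memory entries against the decoder state and softmax-normalizes word and phrase candidates jointly; it is a generic copy/selection mechanism, not the paper's balancing scheme.

```python
# Rough sketch of jointly scoring vocabulary words and phrase-memory entries
# at one decoding step (generic mechanism, not the paper's exact model).
import torch
import torch.nn.functional as F

def decode_step(decoder_state, vocab_logits, phrase_embeddings):
    # decoder_state: (d,), vocab_logits: (V,), phrase_embeddings: (P, d)
    phrase_logits = phrase_embeddings @ decoder_state          # (P,) scores
    joint = torch.cat([vocab_logits, phrase_logits])           # (V + P,)
    return F.softmax(joint, dim=0)  # first V entries: words, last P: phrases

d, V, P = 256, 30000, 8
probs = decode_step(torch.randn(d), torch.randn(V), torch.randn(P, d))
choice = int(torch.argmax(probs))   # >= V means "copy phrase number choice - V"
```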
Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration
Title | Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration |
Authors | Wei Chen, Andrew L Ferguson |
Abstract | Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along pre-specified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks (“autoencoders”) to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. |
Tasks | |
Published | 2017-12-30 |
URL | http://arxiv.org/abs/1801.00203v2 |
http://arxiv.org/pdf/1801.00203v2.pdf | |
PWC | https://paperswithcode.com/paper/molecular-enhanced-sampling-with-autoencoders |
Repo | |
Framework | |
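The collective variables in the work above are the bottleneck activations of an autoencoder over atomic coordinates, which keeps them explicit and differentiable so they can be biased directly. The sketch below is only a bare-bones PyTorch autoencoder with a 2-D bottleneck and arbitrary layer sizes; the coordinate featurization and the OpenMM biasing step are omitted.

```python
# Bare-bones autoencoder whose bottleneck could serve as differentiable CVs
# (illustrative only; featurization and biased sampling are not shown).
import torch
import torch.nn as nn

class CVAutoencoder(nn.Module):
    def __init__(self, n_coords, n_cv=2, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_coords, hidden), nn.Tanh(),
                                     nn.Linear(hidden, n_cv))
        self.decoder = nn.Sequential(nn.Linear(n_cv, hidden), nn.Tanh(),
                                     nn.Linear(hidden, n_coords))

    def forward(self, x):
        cv = self.encoder(x)          # collective variables
        return self.decoder(cv), cv

model = CVAutoencoder(n_coords=66)     # e.g. 22 atoms x 3 for alanine dipeptide
frames = torch.randn(128, 66)          # placeholder trajectory frames
recon, cvs = model(frames)
loss = nn.functional.mse_loss(recon, frames)
loss.backward()
```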
Matched bipartite block model with covariates
Title | Matched bipartite block model with covariates |
Authors | Zahra S. Razaee, Arash A. Amini, Jingyi Jessica Li |
Abstract | Community detection or clustering is a fundamental task in the analysis of network data. Many real networks have a bipartite structure which makes community detection challenging. In this paper, we consider a model which allows for matched communities in the bipartite setting, in addition to node covariates with information about the matching. We derive a simple fast algorithm for fitting the model based on variational inference ideas and show its effectiveness on both simulated and real data. A variation of the model to allow for degree-correction is also considered, in addition to a novel approach to fitting such degree-corrected models. |
Tasks | Community Detection |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04943v1 |
http://arxiv.org/pdf/1703.04943v1.pdf | |
PWC | https://paperswithcode.com/paper/matched-bipartite-block-model-with-covariates |
Repo | |
Framework | |
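The paper fits a matched bipartite block model with covariates via variational inference; that algorithm is not sketched here. As a quick point of comparison, the snippet below runs scikit-learn's spectral co-clustering, a simple covariate-free baseline for finding matched row/column communities in a bipartite adjacency matrix.

```python
# Baseline only: spectral co-clustering of a bipartite adjacency matrix
# (this is not the paper's variational algorithm and ignores covariates).
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
# Toy bipartite graph: two planted, matched communities of rows and columns.
A = rng.binomial(1, 0.10, size=(60, 80)).astype(float)
A[:30, :40] = rng.binomial(1, 0.40, size=(30, 40))
A[30:, 40:] = rng.binomial(1, 0.40, size=(30, 40))

model = SpectralCoclustering(n_clusters=2, random_state=0).fit(A)
row_communities = model.row_labels_
col_communities = model.column_labels_
```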
Weighting Scheme for a Pairwise Multi-label Classifier Based on the Fuzzy Confusion Matrix
Title | Weighting Scheme for a Pairwise Multi-label Classifier Based on the Fuzzy Confusion Matrix |
Authors | Pawel Trajdos, Marek Kurzynski |
Abstract | In this work we address the issue of applying a stochastic classifier and a local, fuzzy confusion matrix under the framework of multi-label classification. We propose a novel solution to the problem of correcting label-pairwise ensembles. The main step of the correction procedure is to compute classifier-specific competence and cross-competence measures, which estimate the error pattern of the underlying classifier. At the fusion phase we employ two weighting approaches based on information theory. The classifier weights promote base classifiers that are the most susceptible to correction based on the fuzzy confusion matrix. In the experimental study, the proposed approach was compared against two reference methods in terms of six different quality criteria. The conducted experiments reveal that the proposed approach eliminates one of the main drawbacks of the original FCM-based approach, namely its vulnerability to imbalanced class/label distributions. Moreover, the obtained results show that the introduced method achieves satisfactory classification quality under all considered quality criteria. Additionally, the impact of fluctuations in data set characteristics is reduced. |
Tasks | Multi-Label Classification |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09710v2 |
http://arxiv.org/pdf/1710.09710v2.pdf | |
PWC | https://paperswithcode.com/paper/weighting-scheme-for-a-pairwise-multi-label |
Repo | |
Framework | |
Noise-Tolerant Interactive Learning from Pairwise Comparisons
Title | Noise-Tolerant Interactive Learning from Pairwise Comparisons |
Authors | Yichong Xu, Hongyang Zhang, Aarti Singh, Kyle Miller, Artur Dubrawski |
Abstract | We study the problem of interactively learning a binary classifier using noisy labeling and pairwise comparison oracles, where the comparison oracle answers which one in the given two instances is more likely to be positive. Learning from such oracles has multiple applications where obtaining direct labels is harder but pairwise comparisons are easier, and the algorithm can leverage both types of oracles. In this paper, we attempt to characterize how the access to an easier comparison oracle helps in improving the label and total query complexity. We show that the comparison oracle reduces the learning problem to that of learning a threshold function. We then present an algorithm that interactively queries the label and comparison oracles and we characterize its query complexity under Tsybakov and adversarial noise conditions for the comparison and labeling oracles. Our lower bounds show that our label and total query complexity is almost optimal. |
Tasks | |
Published | 2017-04-19 |
URL | http://arxiv.org/abs/1704.05820v2 |
http://arxiv.org/pdf/1704.05820v2.pdf | |
PWC | https://paperswithcode.com/paper/noise-tolerant-interactive-learning-from |
Repo | |
Framework | |
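The key reduction in the abstract is that a comparison oracle turns classifier learning into learning a threshold: once instances are ordered by the comparison oracle, the label oracle only needs to locate the cut point. The sketch below shows that reduction in its idealized noiseless form (sorting with the comparison oracle, then binary-searching with the label oracle); the paper's noise-tolerant machinery is not reproduced.

```python
# Idealized (noiseless) sketch of the comparison-to-threshold reduction:
# sort with the comparison oracle, then binary-search with the label oracle.
import functools

def learn_threshold(items, compare, label):
    """compare(a, b) -> -1/0/1 by 'which is more likely positive';
    label(x) -> 0 or 1. Returns the ordered items and the cut index."""
    ordered = sorted(items, key=functools.cmp_to_key(compare))
    lo, hi = 0, len(ordered)            # first positive item lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if label(ordered[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return ordered, lo                  # ordered[lo:] are predicted positive

# Toy 1-D example: true labels are 1 for x >= 0.3.
xs = [0.9, 0.1, 0.5, 0.2, 0.7, 0.25, 0.4]
ordered, cut = learn_threshold(
    xs,
    compare=lambda a, b: (a > b) - (a < b),
    label=lambda x: int(x >= 0.3),
)
print(ordered[cut:])   # -> [0.4, 0.5, 0.7, 0.9]
```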
Real-time 3D Human Tracking for Mobile Robots with Multisensors
Title | Real-time 3D Human Tracking for Mobile Robots with Multisensors |
Authors | Mengmeng Wang, Daobilige Su, Lei Shi, Yong Liu, Jaime Valls Miro |
Abstract | Acquiring the accurate 3-D position of a target person around a robot provides fundamental and valuable information that is applicable to a wide range of robotic tasks, including home service, navigation and entertainment. This paper presents a real-time robotic 3-D human tracking system which combines a monocular camera with an ultrasonic sensor using the extended Kalman filter (EKF). The proposed system consists of three sub-modules: the monocular camera sensor tracking model, the ultrasonic sensor tracking model and multi-sensor fusion. An improved visual tracking algorithm is presented to provide partial location estimation (2-D). The algorithm is designed to overcome severe occlusions, scale variation and target loss, and to achieve robust re-detection. The scale accuracy is further enhanced by the estimated 3-D information. An ultrasonic sensor array is employed to provide the range information from the target person to the robot, and Gaussian Process Regression is used for partial location estimation (2-D). The EKF is adopted to sequentially process multiple, heterogeneous measurements arriving in an asynchronous order from the vision sensor and the ultrasonic sensor. In the experiments, the proposed tracking system is tested on both a simulation platform and an actual mobile robot in various indoor and outdoor scenes. The experimental results show the superior performance of the 3-D tracking system in terms of both accuracy and robustness. |
Tasks | Sensor Fusion, Visual Tracking |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04877v1 |
http://arxiv.org/pdf/1703.04877v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-3d-human-tracking-for-mobile-robots |
Repo | |
Framework | |
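The fusion in the system above is done by an extended Kalman filter that consumes asynchronous measurements from the camera (partial 2-D position) and the ultrasonic array (range). The snippet below is only a minimal NumPy EKF over a constant-velocity state showing one range update; the paper's actual measurement models, GPR step and camera update are not reproduced.

```python
# Minimal EKF sketch: constant-velocity state, one ultrasonic range update.
# (Illustrative only; the paper's full camera/ultrasonic models are omitted.)
import numpy as np

dt = 0.1
x = np.array([1.0, 2.0, 0.5, 0.0, 0.0, 0.0])     # [px, py, pz, vx, vy, vz]
P = np.eye(6)
F = np.eye(6); F[0, 3] = F[1, 4] = F[2, 5] = dt   # state transition
Q = 0.01 * np.eye(6)                               # process noise
R = np.array([[0.05]])                             # range-measurement noise

# Predict.
x = F @ x
P = F @ P @ F.T + Q

# Range update: z = ||p|| (distance from robot to person) + noise.
z = np.array([2.4])                                # measured range
rng_pred = np.linalg.norm(x[:3])
H = np.zeros((1, 6)); H[0, :3] = x[:3] / rng_pred  # Jacobian of h(x) = ||p||
y = z - rng_pred                                   # innovation
S = H @ P @ H.T + R
K = P @ H.T @ np.linalg.inv(S)
x = x + K @ y
P = (np.eye(6) - K @ H) @ P
```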
Sensor Fusion for Robot Control through Deep Reinforcement Learning
Title | Sensor Fusion for Robot Control through Deep Reinforcement Learning |
Authors | Steven Bohez, Tim Verbelen, Elias De Coninck, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt |
Abstract | Deep reinforcement learning is becoming increasingly popular for robot control algorithms, with the aim for a robot to self-learn useful feature representations from unstructured sensory input leading to the optimal actuation policy. In addition to sensors mounted on the robot, sensors might also be deployed in the environment, although these might need to be accessed via an unreliable wireless connection. In this paper, we demonstrate deep neural network architectures that are able to fuse information coming from multiple sensors and are robust to sensor failures at runtime. We evaluate our method on a search and pick task for a robot both in simulation and the real world. |
Tasks | Sensor Fusion |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04550v1 |
http://arxiv.org/pdf/1703.04550v1.pdf | |
PWC | https://paperswithcode.com/paper/sensor-fusion-for-robot-control-through-deep |
Repo | |
Framework | |
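The abstract describes networks that fuse several sensor streams and stay robust when a sensor fails at runtime. The sketch below shows one plausible way to express that in PyTorch: per-sensor encoders, concatenation, and random zeroing of whole sensor embeddings during training. The sensor dimensions, action count and masking scheme are assumptions, not the paper's design.

```python
# Plausible sketch of a fusion Q-network with per-sensor "dropout" at training
# time (layer sizes and the masking scheme are assumptions, not the paper's).
import torch
import torch.nn as nn

class FusionQNet(nn.Module):
    def __init__(self, sensor_dims=(24, 16), n_actions=5, hidden=64):
        super().__init__()
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in sensor_dims)
        self.head = nn.Sequential(
            nn.Linear(hidden * len(sensor_dims), hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, observations, drop_prob=0.0):
        feats = []
        for enc, obs in zip(self.encoders, observations):
            f = enc(obs)
            if self.training and torch.rand(1).item() < drop_prob:
                f = torch.zeros_like(f)       # simulate a failed sensor
            feats.append(f)
        return self.head(torch.cat(feats, dim=-1))

net = FusionQNet()
q_values = net([torch.randn(8, 24), torch.randn(8, 16)], drop_prob=0.2)
```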
Inversion using a new low-dimensional representation of complex binary geological media based on a deep neural network
Title | Inversion using a new low-dimensional representation of complex binary geological media based on a deep neural network |
Authors | Eric Laloy, Romain Hérault, John Lee, Diederik Jacques, Niklas Linde |
Abstract | Efficient and high-fidelity prior sampling and inversion for complex geological media is still a largely unsolved challenge. Here, we use a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base model parameterization of complex binary geological media. For inversion purposes, it has the attractive feature that random draws from an uncorrelated standard normal distribution yield model realizations with spatial characteristics that are in agreement with the training set. In comparison with the most commonly used parametric representations in probabilistic inversion, we find that our dimensionality reduction (DR) approach outperforms principal component analysis (PCA), optimization-PCA (OPCA) and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model. For the considered examples, substantial compression ratios (200-500) are achieved. Given that the construction of our parameterization requires a training set of several tens of thousands of prior model realizations, our DR approach is more suited for probabilistic (or deterministic) inversion than for unconditional (or point-conditioned) geostatistical simulation. Probabilistic inversions of 2D steady-state and 3D transient hydraulic tomography data are used to demonstrate the DR-based inversion. For the 2D case study, the performance is superior to current state-of-the-art multiple-point statistics inversion by sequential geostatistical resampling (SGR). Inversion results for the 3D application are also encouraging. |
Tasks | Dimensionality Reduction |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09196v1 |
http://arxiv.org/pdf/1710.09196v1.pdf | |
PWC | https://paperswithcode.com/paper/inversion-using-a-new-low-dimensional |
Repo | |
Framework | |
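The parameterization above is a variational autoencoder whose low-dimensional latent vector, drawn from a standard normal, decodes to a binary geological realization. The sketch below is a generic fully connected VAE for binary images with made-up sizes; the paper's architecture and the inversion workflow are not shown.

```python
# Generic VAE sketch for binary 2-D media (made-up sizes; not the paper's net).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryMediaVAE(nn.Module):
    def __init__(self, n_pixels=64 * 64, latent=20, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_pixels, hidden), nn.ReLU())
        self.mu, self.logvar = nn.Linear(hidden, latent), nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_pixels))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(logits, x, mu, logvar):
    recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld

# After training, drawing z ~ N(0, I) and decoding yields new realizations:
# realization = torch.sigmoid(model.dec(torch.randn(1, 20))) > 0.5
```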
End-to-end Binary Representation Learning via Direct Binary Embedding
Title | End-to-end Binary Representation Learning via Direct Binary Embedding |
Authors | Liu Liu, Alireza Rahimpour, Ali Taalimi, Hairong Qi |
Abstract | Learning binary representations is essential to large-scale computer vision tasks. Most existing algorithms require a separate quantization constraint to learn effective hashing functions. In this work, we present Direct Binary Embedding (DBE), a simple yet very effective algorithm to learn binary representations in an end-to-end fashion. By appending an ingeniously designed DBE layer to a deep convolutional neural network (DCNN), DBE learns binary codes directly from the continuous DBE layer activations without quantization error. By employing the deep residual network (ResNet) as the DCNN component, DBE captures rich semantics from images. Furthermore, to handle multilabel images, we design a joint cross-entropy loss that includes both softmax cross-entropy and weighted binary cross-entropy, in consideration of the correlation and independence of labels, respectively. Extensive experiments demonstrate the significant superiority of DBE over state-of-the-art methods on natural object recognition, image retrieval and image annotation tasks. |
Tasks | Image Retrieval, Object Recognition, Quantization, Representation Learning |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04960v2 |
http://arxiv.org/pdf/1703.04960v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-binary-representation-learning-via |
Repo | |
Framework | |
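The DBE layer described above is appended to a DCNN so that its activations can be thresholded into binary codes with little quantization error, and multilabel training combines softmax and weighted binary cross-entropy. The sketch below is only a schematic PyTorch rendering of that idea (a linear + batch-norm block squashed toward {0, 1}, binarized at inference); the exact layer definition and loss weighting in the paper may differ.

```python
# Schematic sketch of a DBE-style binary-embedding head (details may differ
# from the paper): continuous codes at training time, thresholded at test time.
import torch
import torch.nn as nn

class DBEHead(nn.Module):
    def __init__(self, in_dim=2048, code_bits=64, n_classes=100):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(in_dim, code_bits),
                                   nn.BatchNorm1d(code_bits))
        self.classifier = nn.Linear(code_bits, n_classes)

    def forward(self, features):
        code = torch.sigmoid(self.embed(features))   # pushed toward {0, 1}
        return self.classifier(code), code

    @torch.no_grad()
    def binary_code(self, features):
        return (torch.sigmoid(self.embed(features)) > 0.5).float()

head = DBEHead()
logits, code = head(torch.randn(16, 2048))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 100, (16,)))
# For multilabel images the paper combines softmax CE with a weighted BCE term.
loss.backward()
```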