Paper Group ANR 60
OCRAPOSE II: An OCR-based indoor positioning system using mobile phone images
Title | OCRAPOSE II: An OCR-based indoor positioning system using mobile phone images |
Authors | Hamed Sadeghi, Shahrokh Valaee, Shahram Shirani |
Abstract | In this paper, we propose an OCR (optical character recognition)-based localization system called OCRAPOSE II, which is applicable in a number of indoor scenarios, including office buildings, parking garages, airports, and grocery stores. In these scenarios, characters (i.e., text or numbers) can be used as suitable distinctive landmarks for localization. The proposed system takes advantage of OCR to read these characters in the query still images and provides a rough location estimate using a floor plan. Then, it finds the depth and angle-of-view of the query using the information provided by the OCR engine in order to refine the location estimate. We derive novel formulas for the query angle-of-view and depth estimation using image line segments and the OCR box information. We demonstrate the applicability and effectiveness of the proposed system through experiments in indoor scenarios. Our system outperforms state-of-the-art benchmarks in location recognition rate and average localization error, especially under sparse database conditions. |
Tasks | Depth Estimation, Optical Character Recognition |
Published | 2017-04-19 |
URL | http://arxiv.org/abs/1704.05591v1, http://arxiv.org/pdf/1704.05591v1.pdf |
PWC | https://paperswithcode.com/paper/ocrapose-ii-an-ocr-based-indoor-positioning |
Repo | |
Framework | |
Convex Geometry of the Generalized Matrix-Fractional Function
Title | Convex Geometry of the Generalized Matrix-Fractional Function |
Authors | James V. Burke, Yuan Gao, Tim Hoheisel |
Abstract | Generalized matrix-fractional (GMF) functions are a class of matrix support functions introduced by Burke and Hoheisel as a tool for unifying a range of seemingly divergent matrix optimization problems associated with inverse problems, regularization and learning. In this paper we dramatically simplify the support function representation for GMF functions as well as the representation of their subdifferentials. These new representations allow the ready computation of a range of important related geometric objects whose formulations were previously unavailable. |
Tasks | |
Published | 2017-03-04 |
URL | http://arxiv.org/abs/1703.01363v1, http://arxiv.org/pdf/1703.01363v1.pdf |
PWC | https://paperswithcode.com/paper/convex-geometry-of-the-generalized-matrix |
Repo | |
Framework | |
Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline
Title | Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline |
Authors | Keith Feldman, Louis Faust, Xian Wu, Chao Huang, Nitesh V. Chawla |
Abstract | From medical charts to national censuses, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. |
Tasks | |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.01513v2, http://arxiv.org/pdf/1706.01513v2.pdf |
PWC | https://paperswithcode.com/paper/beyond-volume-the-impact-of-complex |
Repo | |
Framework | |
Semi-Supervised Deep Learning for Monocular Depth Map Prediction
Title | Semi-Supervised Deep Learning for Monocular Depth Map Prediction |
Authors | Yevhen Kuznietsov, Jörg Stückler, Bastian Leibe |
Abstract | Supervised deep learning often suffers from the lack of sufficient training data. Specifically in the context of monocular depth map prediction, it is barely possible to determine dense ground truth depth images in realistic dynamic outdoor environments. When using LiDAR sensors, for instance, noise is present in the distance measurements, the calibration between sensors cannot be perfect, and the measurements are typically much sparser than the camera images. In this paper, we propose a novel approach to depth map prediction from monocular images that learns in a semi-supervised way. While we use sparse ground-truth depth for supervised learning, we also enforce our deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss. In experiments we demonstrate superior performance in depth map prediction from single images compared to the state-of-the-art methods. |
Tasks | Calibration |
Published | 2017-02-09 |
URL | http://arxiv.org/abs/1702.02706v3, http://arxiv.org/pdf/1702.02706v3.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-deep-learning-for-monocular |
Repo | |
Framework | |
Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network
Title | Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network |
Authors | Frosti Palsson, Johannes R. Sveinsson, Magnus O. Ulfarsson |
Abstract | In this paper, we propose a method using a three dimensional convolutional neural network (3-D-CNN) to fuse together multispectral (MS) and hyperspectral (HS) images to obtain a high resolution hyperspectral image. Dimensionality reduction of the hyperspectral image is performed prior to fusion in order to significantly reduce the computational time and make the method more robust to noise. Experiments are performed on a data set simulated using a real hyperspectral image. The results obtained show that the proposed approach is very promising when compared to conventional methods. This is especially true when the hyperspectral image is corrupted by additive noise. |
Tasks | Dimensionality Reduction |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1706.05249v1, http://arxiv.org/pdf/1706.05249v1.pdf |
PWC | https://paperswithcode.com/paper/multispectral-and-hyperspectral-image-fusion |
Repo | |
Framework | |
A Flow Model of Neural Networks
Title | A Flow Model of Neural Networks |
Authors | Zhen Li, Zuoqiang Shi |
Abstract | Based on a natural connection between ResNet and transport equation or its characteristic equation, we propose a continuous flow model for both ResNet and plain net. Through this continuous model, a ResNet can be explicitly constructed as a refinement of a plain net. The flow model provides an alternative perspective to understand phenomena in deep neural networks, such as why it is necessary and sufficient to use 2-layer blocks in ResNets, why deeper is better, and why ResNets are even deeper, and so on. It also opens a gate to bring in more tools from the huge area of differential equations. |
Tasks | |
Published | 2017-08-21 |
URL | http://arxiv.org/abs/1708.06257v2, http://arxiv.org/pdf/1708.06257v2.pdf |
PWC | https://paperswithcode.com/paper/a-flow-model-of-neural-networks |
Repo | |
Framework | |
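As a rough illustration of the ResNet-as-flow idea in the abstract above, a residual update `x + h * f(x, t)` can be read as one forward-Euler step of the ordinary differential equation dx/dt = f(x, t). The sketch below shows only this common reading of the connection, not the authors' construction; the function name `residual_forward`, the step size `h`, and the toy velocity field are all illustrative assumptions.

```python
import math


def residual_forward(x, velocity, depth, total_time=1.0):
    """Apply `depth` residual updates x <- x + h * velocity(x, t).

    Each update is one forward-Euler step of dx/dt = velocity(x, t),
    so a deeper stack corresponds to a finer time discretization.
    """
    h = total_time / depth  # step size shrinks as the network gets deeper
    for k in range(depth):
        x = x + h * velocity(x, k * h)
    return x


def decay(x, t):
    """Toy velocity field (an assumption): dx/dt = -x, exact solution x0 * exp(-t)."""
    return -x


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for depth in (10, 100, 1000):
        approx = residual_forward(1.0, decay, depth)
        print(depth, approx, abs(approx - exact))
```

In this toy setting the gap to the exact solution shrinks as `depth` grows, which is the sense in which a deeper residual stack refines a coarser one.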
Convolutional Neural Networks for Medical Diagnosis from Admission Notes
Title | Convolutional Neural Networks for Medical Diagnosis from Admission Notes |
Authors | Christy Li, Dimitris Konomis, Graham Neubig, Pengtao Xie, Carol Cheng, Eric Xing |
Abstract | Objective: Develop an automatic diagnostic system which uses only textual admission information from Electronic Health Records (EHRs) and assists clinicians with a timely and statistically proven decision tool. The hope is that the tool can be used to reduce misdiagnosis. Materials and Methods: We use the real-world clinical notes from MIMIC-III, a freely available dataset consisting of clinical data of more than forty thousand patients who stayed in intensive care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. We propose a Convolutional Neural Network model to learn semantic features from unstructured textual input and automatically predict the primary discharge diagnosis. Results: The proposed model achieved an overall 96.11% accuracy and 80.48% weighted F1 score on the 10 most frequent disease classes, significantly outperforming four strong baseline models by at least 12.7% in weighted F1 score. Discussion: Experimental results imply that the CNN model is suitable for supporting diagnosis decision making in the presence of complex, noisy and unstructured clinical data while at the same time using fewer layers and parameters than other traditional Deep Network models. Conclusion: Our model demonstrated the capability of representing complex, medically meaningful features from unstructured clinical notes and prediction power for commonly misdiagnosed frequent diseases. It can be easily adopted in clinical settings to provide timely and statistically proven decision support. Keywords: Convolutional neural network, text classification, discharge diagnosis prediction, admission information from EHRs. |
Tasks | Decision Making, Medical Diagnosis, Text Classification |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02768v1, http://arxiv.org/pdf/1712.02768v1.pdf |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-medical-1 |
Repo | |
Framework | |
Hashing in the Zero Shot Framework with Domain Adaptation
Title | Hashing in the Zero Shot Framework with Domain Adaptation |
Authors | Shubham Pachori, Ameya Deshpande, Shanmuganathan Raman |
Abstract | Techniques to learn hash codes which can store and retrieve high-dimensional multimedia data efficiently have attracted broad research interest in recent years. With the rapid explosion of newly emerged concepts and online data, existing supervised hashing algorithms suffer from the scarcity of ground truth annotations due to the high cost of obtaining manual annotations. Therefore, we propose an algorithm to learn a hash function from training images belonging to 'seen' classes which can efficiently encode images of 'unseen' classes to binary codes. Specifically, we project the image features from visual space and semantic features from semantic space into a common Hamming subspace. Earlier works on generating hash codes have tried to relax the discrete constraints on hash codes and solve the continuous optimization problem. However, this often leads to quantization errors. In this work, we use the max-margin classifier to learn an efficient hash function. To address the concern of domain shift which may arise due to the introduction of new classes, we also introduce an unsupervised domain adaptation model in the proposed hashing framework. Results on three datasets show the advantage of using domain adaptation in learning a high-quality hash function and the superiority of our method for image retrieval compared to several state-of-the-art hashing methods. |
Tasks | Domain Adaptation, Image Retrieval, Quantization, Unsupervised Domain Adaptation |
Published | 2017-02-07 |
URL | http://arxiv.org/abs/1702.01933v2, http://arxiv.org/pdf/1702.01933v2.pdf |
PWC | https://paperswithcode.com/paper/hashing-in-the-zero-shot-framework-with |
Repo | |
Framework | |
Joint Learning of Set Cardinality and State Distribution
Title | Joint Learning of Set Cardinality and State Distribution |
Authors | S. Hamid Rezatofighi, Anton Milan, Qinfeng Shi, Anthony Dick, Ian Reid |
Abstract | We present a novel approach for learning to predict sets using deep learning. In recent years, deep neural networks have shown remarkable results in computer vision, natural language processing and other related problems. Despite their success, traditional architectures suffer from a serious limitation in that they are built to deal with structured input and output data, i.e. vectors or matrices. Many real-world problems, however, are naturally described as sets, rather than vectors. Existing techniques that allow for sequential data, such as recurrent neural networks, typically heavily depend on the input and output order and do not guarantee a valid solution. Here, we derive in a principled way, a mathematical formulation for set prediction where the output is permutation invariant. In particular, our approach jointly learns both the cardinality and the state distribution of the target set. We demonstrate the validity of our method on the task of multi-label image classification and achieve a new state of the art on the PASCAL VOC and MS COCO datasets. |
Tasks | Image Classification |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04093v2, http://arxiv.org/pdf/1709.04093v2.pdf |
PWC | https://paperswithcode.com/paper/joint-learning-of-set-cardinality-and-state |
Repo | |
Framework | |
Decorrelated Jet Substructure Tagging using Adversarial Neural Networks
Title | Decorrelated Jet Substructure Tagging using Adversarial Neural Networks |
Authors | Chase Shimmin, Peter Sadowski, Pierre Baldi, Edison Weik, Daniel Whiteson, Edward Goul, Andreas Søgaard |
Abstract | We describe a strategy for constructing a neural network jet substructure tagger which powerfully discriminates boosted decay signals while remaining largely uncorrelated with the jet mass. This reduces the impact of systematic uncertainties in background modeling while enhancing signal purity, resulting in improved discovery significance relative to existing taggers. The network is trained using an adversarial strategy, resulting in a tagger that learns to balance classification accuracy with decorrelation. As a benchmark scenario, we consider the case where large-radius jets originating from a boosted resonance decay are discriminated from a background of nonresonant quark and gluon jets. We show that in the presence of systematic uncertainties on the background rate, our adversarially-trained, decorrelated tagger considerably outperforms a conventionally trained neural network, despite having a slightly worse signal-background separation power. We generalize the adversarial training technique to include a parametric dependence on the signal hypothesis, training a single network that provides optimized, interpolatable decorrelated jet tagging across a continuous range of hypothetical resonance masses, after training on discrete choices of the signal mass. |
Tasks | |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03507v1, http://arxiv.org/pdf/1703.03507v1.pdf |
PWC | https://paperswithcode.com/paper/decorrelated-jet-substructure-tagging-using |
Repo | |
Framework | |
DeepStory: Video Story QA by Deep Embedded Memory Networks
Title | DeepStory: Video Story QA by Deep Embedded Memory Networks |
Authors | Kyung-Min Kim, Min-Oh Heo, Seong-Ho Choi, Byoung-Tak Zhang |
Abstract | Question-answering (QA) on video content is a significant challenge for achieving human-level intelligence as it involves both vision and language in real-world settings. Here we demonstrate the possibility of an AI agent performing video story QA by learning from a large number of cartoon videos. We develop a video-story learning model, i.e., Deep Embedded Memory Networks (DEMN), to reconstruct stories from a joint scene-dialogue video stream using a latent embedding space of observed data. The video stories are stored in a long-term memory component. For a given question, an LSTM-based attention model uses the long-term memory to recall the best question-story-answer triplet by focusing on specific words containing key information. We trained the DEMN on a novel QA dataset of children’s cartoon video series, Pororo. The dataset contains 16,066 scene-dialogue pairs of 20.5-hour videos, 27,328 fine-grained sentences for scene description, and 8,913 story-related QA pairs. Our experimental results show that the DEMN outperforms other QA models. This is mainly due to 1) the reconstruction of video stories in a scene-dialogue combined form that utilizes the latent embedding and 2) attention. DEMN also achieved state-of-the-art results on the MovieQA benchmark. |
Tasks | Question Answering, Video Story QA |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00836v1, http://arxiv.org/pdf/1707.00836v1.pdf |
PWC | https://paperswithcode.com/paper/deepstory-video-story-qa-by-deep-embedded |
Repo | |
Framework | |
Diversity driven Attention Model for Query-based Abstractive Summarization
Title | Diversity driven Attention Model for Query-based Abstractive Summarization |
Authors | Preksha Nema, Mitesh Khapra, Anirban Laha, Balaraman Ravindran |
Abstract | Abstractive summarization aims to generate a shorter version of the document covering all the salient points in a compact and coherent fashion. On the other hand, query-based summarization highlights those points that are relevant in the context of a given query. The encode-attend-decode paradigm has achieved notable success in machine translation, extractive summarization, dialog systems, etc. But it suffers from the drawback of generating repeated phrases. In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions: (i) a query attention model (in addition to the document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query) and (ii) a new diversity-based attention model which aims to alleviate the problem of repeating phrases in the summary. In order to enable the testing of this model we introduce a new query-based summarization dataset building on Debatepedia. Our experiments show that with these two additions the proposed model clearly outperforms vanilla encode-attend-decode models with a gain of 28% (absolute) in ROUGE-L scores. |
Tasks | Abstractive Text Summarization, Machine Translation |
Published | 2017-04-26 |
URL | http://arxiv.org/abs/1704.08300v2, http://arxiv.org/pdf/1704.08300v2.pdf |
PWC | https://paperswithcode.com/paper/diversity-driven-attention-model-for-query |
Repo | |
Framework | |
Composition of Credal Sets via Polyhedral Geometry
Title | Composition of Credal Sets via Polyhedral Geometry |
Authors | Jiřina Vejnarová, Václav Kratochvíl |
Abstract | The recently introduced composition operator for credal sets is analogous to such operators in probability, possibility, evidence, and valuation-based systems theories. It was designed to construct multidimensional models (in the framework of credal sets) from a system of low-dimensional credal sets. In this paper, we study its potential from the computational point of view, utilizing methods of polyhedral geometry. |
Tasks | |
Published | 2017-05-05 |
URL | http://arxiv.org/abs/1705.03352v1, http://arxiv.org/pdf/1705.03352v1.pdf |
PWC | https://paperswithcode.com/paper/composition-of-credal-sets-via-polyhedral |
Repo | |
Framework | |
Assessing the Performance of Deep Learning Algorithms for Newsvendor Problem
Title | Assessing the Performance of Deep Learning Algorithms for Newsvendor Problem |
Authors | Yanfei Zhang, Junbin Gao |
Abstract | In retail management, the Newsvendor problem has attracted wide attention as one of the basic inventory models. The traditional approach to solving this problem relies on the probability distribution of the demand. In theory, if the probability distribution is known, the problem can be considered fully solved. However, in any real-world scenario, it is almost impossible to even approximate or estimate a good probability distribution for the demand. In recent years, researchers have started adopting machine learning approaches to learn a demand prediction model using other feature information. In this paper, we propose a supervised learning method that optimizes the demand quantities for products based on feature information. We demonstrate that using the original Newsvendor loss function as the training objective outperforms the recently suggested quadratic loss function. The new algorithm has been assessed on both synthetic and real-world data, demonstrating better performance. |
Tasks | |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.02899v1, http://arxiv.org/pdf/1706.02899v1.pdf |
PWC | https://paperswithcode.com/paper/assessing-the-performance-of-deep-learning |
Repo | |
Framework | |
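To make the contrast between the Newsvendor loss and the quadratic loss in the abstract above concrete, here is a minimal sketch using the textbook Newsvendor cost: a per-unit underage cost for unmet demand plus a per-unit overage cost for unsold stock. The abstract does not state the exact objective the authors train on, so the cost values, the toy demand sample, and the helper names (`newsvendor_loss`, `quadratic_loss`, `best_constant_order`) are all assumptions made for illustration.

```python
def newsvendor_loss(order, demand, underage_cost, overage_cost):
    """Textbook Newsvendor cost: penalize shortage and surplus asymmetrically."""
    shortage = max(demand - order, 0.0)
    surplus = max(order - demand, 0.0)
    return underage_cost * shortage + overage_cost * surplus


def quadratic_loss(order, demand):
    """Symmetric squared error between the order quantity and the demand."""
    return (order - demand) ** 2


def best_constant_order(demands, loss):
    """Pick the candidate order (taken from observed demands) with lowest mean loss."""
    def mean_loss(order):
        return sum(loss(order, d) for d in demands) / len(demands)
    return min(demands, key=mean_loss)


if __name__ == "__main__":
    demands = [8.0, 10.0, 12.0, 20.0]  # toy demand history (assumed)

    def nv(order, demand):
        # Running short is assumed to be four times as costly as overstocking.
        return newsvendor_loss(order, demand, underage_cost=4.0, overage_cost=1.0)

    print("newsvendor choice:", best_constant_order(demands, nv))
    print("quadratic choice:", best_constant_order(demands, quadratic_loss))
```

On this toy sample the asymmetric cost pushes the chosen order toward the high end of the demand range, while the quadratic loss settles near the mean; that difference is one reason the choice of training objective can matter.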
Higher-order clustering in networks
Title | Higher-order clustering in networks |
Authors | Hao Yin, Austin R. Benson, Jure Leskovec |
Abstract | A fundamental property of complex networks is the tendency for edges to cluster. The extent of the clustering is typically quantified by the clustering coefficient, which is the probability that a length-2 path is closed, i.e., induces a triangle in the network. However, higher-order cliques beyond triangles are crucial to understanding complex networks, and the clustering behavior with respect to such higher-order network structures is not well understood. Here we introduce higher-order clustering coefficients that measure the closure probability of higher-order network cliques and provide a more comprehensive view of how the edges of complex networks cluster. Our higher-order clustering coefficients are a natural generalization of the traditional clustering coefficient. We derive several properties about higher-order clustering coefficients and analyze them under common random graph models. Finally, we use higher-order clustering coefficients to gain new insights into the structure of real-world networks from several domains. |
Tasks | |
Published | 2017-04-12 |
URL | http://arxiv.org/abs/1704.03913v2, http://arxiv.org/pdf/1704.03913v2.pdf |
PWC | https://paperswithcode.com/paper/higher-order-clustering-in-networks |
Repo | |
Framework | |
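The traditional clustering coefficient described in the abstract above, the probability that a length-2 path is closed into a triangle, can be sketched in a few lines. This covers only the classical (global) coefficient; the higher-order coefficients introduced in the paper are not implemented here. The function name `global_clustering_coefficient` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def global_clustering_coefficient(edges):
    """Fraction of length-2 paths (wedges) whose two endpoints are also adjacent."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    wedges = 0
    closed = 0
    for neighbors in adjacency.values():
        # Every unordered pair of neighbors of a node forms one wedge centered there.
        for a, b in combinations(neighbors, 2):
            wedges += 1
            if b in adjacency[a]:
                closed += 1
    return closed / wedges if wedges else 0.0


if __name__ == "__main__":
    triangle_with_tail = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(global_clustering_coefficient(triangle_with_tail))
```

In the example graph, three of the five wedges are closed by the single triangle, so the coefficient is 3/5.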