July 29, 2019


Paper Group ANR 60


OCRAPOSE II: An OCR-based indoor positioning system using mobile phone images


Title	OCRAPOSE II: An OCR-based indoor positioning system using mobile phone images
Authors	Hamed Sadeghi, Shahrokh Valaee, Shahram Shirani
Abstract	In this paper, we propose an OCR (optical character recognition)-based localization system called OCRAPOSE II, which is applicable in a number of indoor scenarios, including office buildings, parking lots, airports, and grocery stores. In these scenarios, characters (i.e., text or numbers) can be used as suitable distinctive landmarks for localization. The proposed system takes advantage of OCR to read these characters in the query still images and provides a rough location estimate using a floor plan. Then, it finds the depth and angle-of-view of the query using the information provided by the OCR engine in order to refine the location estimate. We derive novel formulas for the query angle-of-view and depth estimation using image line segments and the OCR box information. We demonstrate the applicability and effectiveness of the proposed system through experiments in indoor scenarios. It is shown that our system demonstrates better performance compared to the state-of-the-art benchmarks in terms of location recognition rate and average localization error, especially under sparse database conditions.
Tasks	Depth Estimation, Optical Character Recognition
Published	2017-04-19
URL	http://arxiv.org/abs/1704.05591v1
PDF	http://arxiv.org/pdf/1704.05591v1.pdf
PWC	https://paperswithcode.com/paper/ocrapose-ii-an-ocr-based-indoor-positioning
Repo
Framework
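
The abstract above mentions estimating depth from OCR box information but does not give the formulas. As a rough illustration only, the sketch below uses the generic pinhole-camera relation (similar triangles) for a fronto-parallel sign of known physical height. It is not the paper's derivation, and the function name and parameters are hypothetical.

```python
def depth_from_known_height(focal_length_px, real_height_m, box_height_px):
    """Estimate camera-to-sign distance with the pinhole model.

    Assumption (not from the paper): the sign is roughly fronto-parallel,
    its physical character height is known, and the OCR engine reports the
    height of the detected text box in pixels.
    """
    if box_height_px <= 0:
        raise ValueError("box_height_px must be positive")
    # Similar triangles: box_height_px / focal_length_px = real_height_m / depth
    return focal_length_px * real_height_m / box_height_px


# Characters 0.2 m tall that appear 50 px tall with a 1000 px focal length
estimated_depth_m = depth_from_known_height(1000.0, 0.2, 50.0)
```

Under this model, halving the apparent box height doubles the estimated distance, which is why a tilted view (where the box is foreshortened) would need the angle-of-view correction the abstract refers to.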

Convex Geometry of the Generalized Matrix-Fractional Function


Title	Convex Geometry of the Generalized Matrix-Fractional Function
Authors	James V. Burke, Yuan Gao, Tim Hoheisel
Abstract	Generalized matrix-fractional (GMF) functions are a class of matrix support functions introduced by Burke and Hoheisel as a tool for unifying a range of seemingly divergent matrix optimization problems associated with inverse problems, regularization and learning. In this paper we dramatically simplify the support function representation for GMF functions as well as the representation of their subdifferentials. These new representations allow the ready computation of a range of important related geometric objects whose formulations were previously unavailable.
Tasks
Published	2017-03-04
URL	http://arxiv.org/abs/1703.01363v1
PDF	http://arxiv.org/pdf/1703.01363v1.pdf
PWC	https://paperswithcode.com/paper/convex-geometry-of-the-generalized-matrix
Repo
Framework

Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline


Title	Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline
Authors	Keith Feldman, Louis Faust, Xian Wu, Chao Huang, Nitesh V. Chawla
Abstract	From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, today, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques.
Tasks
Published	2017-06-01
URL	http://arxiv.org/abs/1706.01513v2
PDF	http://arxiv.org/pdf/1706.01513v2.pdf
PWC	https://paperswithcode.com/paper/beyond-volume-the-impact-of-complex
Repo
Framework

Semi-Supervised Deep Learning for Monocular Depth Map Prediction


Title	Semi-Supervised Deep Learning for Monocular Depth Map Prediction
Authors	Yevhen Kuznietsov, Jörg Stückler, Bastian Leibe
Abstract	Supervised deep learning often suffers from the lack of sufficient training data. Specifically in the context of monocular depth map prediction, it is barely possible to determine dense ground truth depth images in realistic dynamic outdoor environments. When using LiDAR sensors, for instance, noise is present in the distance measurements, the calibration between sensors cannot be perfect, and the measurements are typically much sparser than the camera images. In this paper, we propose a novel approach to depth map prediction from monocular images that learns in a semi-supervised way. While we use sparse ground-truth depth for supervised learning, we also enforce our deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss. In experiments we demonstrate superior performance in depth map prediction from single images compared to the state-of-the-art methods.
Tasks	Calibration
Published	2017-02-09
URL	http://arxiv.org/abs/1702.02706v3
PDF	http://arxiv.org/pdf/1702.02706v3.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-deep-learning-for-monocular
Repo
Framework

Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network


Title	Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network
Authors	Frosti Palsson, Johannes R. Sveinsson, Magnus O. Ulfarsson
Abstract	In this paper, we propose a method using a three dimensional convolutional neural network (3-D-CNN) to fuse together multispectral (MS) and hyperspectral (HS) images to obtain a high resolution hyperspectral image. Dimensionality reduction of the hyperspectral image is performed prior to fusion in order to significantly reduce the computational time and make the method more robust to noise. Experiments are performed on a data set simulated using a real hyperspectral image. The results obtained show that the proposed approach is very promising when compared to conventional methods. This is especially true when the hyperspectral image is corrupted by additive noise.
Tasks	Dimensionality Reduction
Published	2017-06-16
URL	http://arxiv.org/abs/1706.05249v1
PDF	http://arxiv.org/pdf/1706.05249v1.pdf
PWC	https://paperswithcode.com/paper/multispectral-and-hyperspectral-image-fusion
Repo
Framework

A Flow Model of Neural Networks


Title	A Flow Model of Neural Networks
Authors	Zhen Li, Zuoqiang Shi
Abstract	Based on a natural connection between ResNet and transport equation or its characteristic equation, we propose a continuous flow model for both ResNet and plain net. Through this continuous model, a ResNet can be explicitly constructed as a refinement of a plain net. The flow model provides an alternative perspective to understand phenomena in deep neural networks, such as why it is necessary and sufficient to use 2-layer blocks in ResNets, why deeper is better, and why ResNets are even deeper, and so on. It also opens a gate to bring in more tools from the huge area of differential equations.
Tasks
Published	2017-08-21
URL	http://arxiv.org/abs/1708.06257v2
PDF	http://arxiv.org/pdf/1708.06257v2.pdf
PWC	https://paperswithcode.com/paper/a-flow-model-of-neural-networks
Repo
Framework
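
A commonly noted way to read the ResNet/differential-equation connection is that a residual update x + h·f(x) has the same form as one forward Euler step of dx/dt = f(x). The sketch below illustrates only that correspondence, using a scalar toy dynamics; it is not the paper's flow model, and `residual_stack` and its parameters are hypothetical names.

```python
import math


def residual_stack(x0, residual_fn, num_blocks, total_time=1.0):
    """Apply num_blocks residual updates x <- x + h * f(x).

    With h = total_time / num_blocks, each update is one forward Euler step
    of dx/dt = f(x), so adding blocks refines the time discretization.
    """
    h = total_time / num_blocks
    x = x0
    for _ in range(num_blocks):
        x = x + h * residual_fn(x)
    return x


def decay(x):
    # Toy dynamics dx/dt = -x, whose exact solution at t = 1 is x0 * exp(-1)
    return -x


shallow = residual_stack(1.0, decay, num_blocks=10)
deep = residual_stack(1.0, decay, num_blocks=1000)
exact = math.exp(-1.0)
```

In this toy setting the deeper stack lands closer to the exact solution, because its step size is smaller.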

Convolutional Neural Networks for Medical Diagnosis from Admission Notes


Title	Convolutional Neural Networks for Medical Diagnosis from Admission Notes
Authors	Christy Li, Dimitris Konomis, Graham Neubig, Pengtao Xie, Carol Cheng, Eric Xing
Abstract	Objective: Develop an automatic diagnostic system which only uses textual admission information from Electronic Health Records (EHRs) and assist clinicians with a timely and statistically proved decision tool. The hope is that the tool can be used to reduce mis-diagnosis. Materials and Methods: We use the real-world clinical notes from MIMIC-III, a freely available dataset consisting of clinical data of more than forty thousand patients who stayed in intensive care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. We proposed a Convolutional Neural Network model to learn semantic features from unstructured textual input and automatically predict primary discharge diagnosis. Results: The proposed model achieved an overall 96.11% accuracy and 80.48% weighted F1 score on the 10 most frequent disease classes, significantly outperforming four strong baseline models by at least 12.7% in weighted F1 score. Discussion: Experimental results imply that the CNN model is suitable for supporting diagnosis decision making in the presence of complex, noisy and unstructured clinical data while at the same time using fewer layers and parameters than other traditional Deep Network models. Conclusion: Our model demonstrated the capability of representing complex, medically meaningful features from unstructured clinical notes and prediction power for commonly misdiagnosed frequent diseases. It can be easily adopted in a clinical setting to provide timely and statistically proved decision support. Keywords: Convolutional neural network, text classification, discharge diagnosis prediction, admission information from EHRs.
Tasks	Decision Making, Medical Diagnosis, Text Classification
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02768v1
PDF	http://arxiv.org/pdf/1712.02768v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-medical-1
Repo
Framework

Hashing in the Zero Shot Framework with Domain Adaptation


Title	Hashing in the Zero Shot Framework with Domain Adaptation
Authors	Shubham Pachori, Ameya Deshpande, Shanmuganathan Raman
Abstract	Techniques to learn hash codes which can store and retrieve large dimensional multimedia data efficiently have attracted broad research interest in recent years. With the rapid explosion of newly emerged concepts and online data, existing supervised hashing algorithms suffer from the problem of scarcity of ground truth annotations due to the high cost of obtaining manual annotations. Therefore, we propose an algorithm to learn a hash function from training images belonging to 'seen' classes which can efficiently encode images of 'unseen' classes to binary codes. Specifically, we project the image features from visual space and semantic features from semantic space into a common Hamming subspace. Earlier works to generate hash codes have tried to relax the discrete constraints on hash codes and solve the continuous optimization problem. However, this often leads to quantization errors. In this work, we use the max-margin classifier to learn an efficient hash function. To address the concern of domain-shift which may arise due to the introduction of new classes, we also introduce an unsupervised domain adaptation model in the proposed hashing framework. Results on the three datasets show the advantage of using domain adaptation in learning a high-quality hash function and the superiority of our method for the task of image retrieval as compared to several state-of-the-art hashing methods.
Tasks	Domain Adaptation, Image Retrieval, Quantization, Unsupervised Domain Adaptation
Published	2017-02-07
URL	http://arxiv.org/abs/1702.01933v2
PDF	http://arxiv.org/pdf/1702.01933v2.pdf
PWC	https://paperswithcode.com/paper/hashing-in-the-zero-shot-framework-with
Repo
Framework
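
To make the retrieval side of hashing concrete, the sketch below binarizes a linear projection of a feature vector and ranks database items by Hamming distance. The projection matrix is a fixed, hypothetical stand-in: the paper learns its hash function with a max-margin classifier and domain adaptation, neither of which is reproduced here.

```python
def hash_code(features, projection):
    """Binarize a linear projection of the feature vector.

    Bit k is 1 when row k of `projection` has a positive dot product with
    `features`. The projection is a hypothetical placeholder for a learned one.
    """
    return tuple(
        1 if sum(w * x for w, x in zip(row, features)) > 0 else 0
        for row in projection
    )


def hamming_distance(code_a, code_b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def retrieve(query_code, database_codes, top_k=1):
    """Return indices of the top_k database codes closest to the query."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return order[:top_k]


projection = [[1, 0], [0, 1], [1, 1]]            # hypothetical 3-bit hash
query = hash_code([1, -2], projection)           # (1, 0, 0)
database = [(1, 1, 1), (1, 0, 1), (0, 1, 1)]
nearest = retrieve(query, database, top_k=2)     # [1, 0]
```

Because the codes are binary, the distance computation reduces to counting differing bits, which is what makes hash-based retrieval cheap at scale.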

Joint Learning of Set Cardinality and State Distribution


Title	Joint Learning of Set Cardinality and State Distribution
Authors	S. Hamid Rezatofighi, Anton Milan, Qinfeng Shi, Anthony Dick, Ian Reid
Abstract	We present a novel approach for learning to predict sets using deep learning. In recent years, deep neural networks have shown remarkable results in computer vision, natural language processing and other related problems. Despite their success, traditional architectures suffer from a serious limitation in that they are built to deal with structured input and output data, i.e. vectors or matrices. Many real-world problems, however, are naturally described as sets, rather than vectors. Existing techniques that allow for sequential data, such as recurrent neural networks, typically heavily depend on the input and output order and do not guarantee a valid solution. Here, we derive in a principled way, a mathematical formulation for set prediction where the output is permutation invariant. In particular, our approach jointly learns both the cardinality and the state distribution of the target set. We demonstrate the validity of our method on the task of multi-label image classification and achieve a new state of the art on the PASCAL VOC and MS COCO datasets.
Tasks	Image Classification
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04093v2
PDF	http://arxiv.org/pdf/1709.04093v2.pdf
PWC	https://paperswithcode.com/paper/joint-learning-of-set-cardinality-and-state
Repo
Framework

Decorrelated Jet Substructure Tagging using Adversarial Neural Networks


Title	Decorrelated Jet Substructure Tagging using Adversarial Neural Networks
Authors	Chase Shimmin, Peter Sadowski, Pierre Baldi, Edison Weik, Daniel Whiteson, Edward Goul, Andreas Søgaard
Abstract	We describe a strategy for constructing a neural network jet substructure tagger which powerfully discriminates boosted decay signals while remaining largely uncorrelated with the jet mass. This reduces the impact of systematic uncertainties in background modeling while enhancing signal purity, resulting in improved discovery significance relative to existing taggers. The network is trained using an adversarial strategy, resulting in a tagger that learns to balance classification accuracy with decorrelation. As a benchmark scenario, we consider the case where large-radius jets originating from a boosted resonance decay are discriminated from a background of nonresonant quark and gluon jets. We show that in the presence of systematic uncertainties on the background rate, our adversarially-trained, decorrelated tagger considerably outperforms a conventionally trained neural network, despite having a slightly worse signal-background separation power. We generalize the adversarial training technique to include a parametric dependence on the signal hypothesis, training a single network that provides optimized, interpolatable decorrelated jet tagging across a continuous range of hypothetical resonance masses, after training on discrete choices of the signal mass.
Tasks
Published	2017-03-10
URL	http://arxiv.org/abs/1703.03507v1
PDF	http://arxiv.org/pdf/1703.03507v1.pdf
PWC	https://paperswithcode.com/paper/decorrelated-jet-substructure-tagging-using
Repo
Framework

DeepStory: Video Story QA by Deep Embedded Memory Networks


Title	DeepStory: Video Story QA by Deep Embedded Memory Networks
Authors	Kyung-Min Kim, Min-Oh Heo, Seong-Ho Choi, Byoung-Tak Zhang
Abstract	Question-answering (QA) on video contents is a significant challenge for achieving human-level intelligence as it involves both vision and language in real-world settings. Here we demonstrate the possibility of an AI agent performing video story QA by learning from a large number of cartoon videos. We develop a video-story learning model, i.e., Deep Embedded Memory Networks (DEMN), to reconstruct stories from a joint scene-dialogue video stream using a latent embedding space of observed data. The video stories are stored in a long-term memory component. For a given question, an LSTM-based attention model uses the long-term memory to recall the best question-story-answer triplet by focusing on specific words containing key information. We trained the DEMN on a novel QA dataset of children’s cartoon video series, Pororo. The dataset contains 16,066 scene-dialogue pairs of 20.5-hour videos, 27,328 fine-grained sentences for scene description, and 8,913 story-related QA pairs. Our experimental results show that the DEMN outperforms other QA models. This is mainly due to 1) the reconstruction of video stories in a scene-dialogue combined form that utilizes the latent embedding and 2) attention. DEMN also achieved state-of-the-art results on the MovieQA benchmark.
Tasks	Question Answering, Video Story QA
Published	2017-07-04
URL	http://arxiv.org/abs/1707.00836v1
PDF	http://arxiv.org/pdf/1707.00836v1.pdf
PWC	https://paperswithcode.com/paper/deepstory-video-story-qa-by-deep-embedded
Repo
Framework

Diversity driven Attention Model for Query-based Abstractive Summarization


Title	Diversity driven Attention Model for Query-based Abstractive Summarization
Authors	Preksha Nema, Mitesh Khapra, Anirban Laha, Balaraman Ravindran
Abstract	Abstractive summarization aims to generate a shorter version of the document covering all the salient points in a compact and coherent fashion. On the other hand, query-based summarization highlights those points that are relevant in the context of a given query. The encode-attend-decode paradigm has achieved notable success in machine translation, extractive summarization, dialog systems, etc. But it suffers from the drawback of generation of repeated phrases. In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions (i) a query attention model (in addition to document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query) and (ii) a new diversity based attention model which aims to alleviate the problem of repeating phrases in the summary. In order to enable the testing of this model we introduce a new query-based summarization dataset building on debatepedia. Our experiments show that with these two additions the proposed model clearly outperforms vanilla encode-attend-decode models with a gain of 28% (absolute) in ROUGE-L scores.
Tasks	Abstractive Text Summarization, Machine Translation
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08300v2
PDF	http://arxiv.org/pdf/1704.08300v2.pdf
PWC	https://paperswithcode.com/paper/diversity-driven-attention-model-for-query
Repo
Framework

Composition of Credal Sets via Polyhedral Geometry


Title	Composition of Credal Sets via Polyhedral Geometry
Authors	Jiřina Vejnarová, Václav Kratochvíl
Abstract	The recently introduced composition operator for credal sets is an analogue of such operators in probability, possibility, evidence and valuation-based systems theories. It was designed to construct multidimensional models (in the framework of credal sets) from a system of low-dimensional credal sets. In this paper we study its potential from the computational point of view, utilizing methods of polyhedral geometry.
Tasks
Published	2017-05-05
URL	http://arxiv.org/abs/1705.03352v1
PDF	http://arxiv.org/pdf/1705.03352v1.pdf
PWC	https://paperswithcode.com/paper/composition-of-credal-sets-via-polyhedral
Repo
Framework

Assessing the Performance of Deep Learning Algorithms for Newsvendor Problem


Title	Assessing the Performance of Deep Learning Algorithms for Newsvendor Problem
Authors	Yanfei Zhang, Junbin Gao
Abstract	In retail management, the Newsvendor problem has attracted wide attention as one of the basic inventory models. The traditional approach to solving this problem relies on the probability distribution of the demand. In theory, if the probability distribution is known, the problem can be considered fully solved. However, in any real-world scenario, it is almost impossible to even approximate or estimate a good probability distribution for the demand. In recent years, researchers have started adopting machine learning approaches to learn a demand prediction model from other feature information. In this paper, we propose a supervised learning method that optimizes the demand quantities for products based on feature information. We demonstrate that the original Newsvendor loss function as the training objective outperforms the recently suggested quadratic loss function. The new algorithm has been assessed on both synthetic and real-world data, demonstrating better performance.
Tasks
Published	2017-06-09
URL	http://arxiv.org/abs/1706.02899v1
PDF	http://arxiv.org/pdf/1706.02899v1.pdf
PWC	https://paperswithcode.com/paper/assessing-the-performance-of-deep-learning
Repo
Framework
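
For readers unfamiliar with the Newsvendor loss mentioned in the abstract above, the sketch below writes out its standard textbook form: a per-unit penalty for ordering too little plus a per-unit penalty for ordering too much. The cost values and function names are illustrative assumptions, and the paper's feature-based learning model is not reproduced; the order quantity here is chosen directly from demand samples.

```python
def newsvendor_loss(order, demand, underage_cost, overage_cost):
    """Cost of ordering `order` units when `demand` units are requested."""
    shortage = max(demand - order, 0.0)   # unmet demand
    surplus = max(order - demand, 0.0)    # unsold stock
    return underage_cost * shortage + overage_cost * surplus


def mean_newsvendor_loss(order, demands, underage_cost, overage_cost):
    """Average Newsvendor loss of one order quantity over demand samples."""
    total = sum(
        newsvendor_loss(order, d, underage_cost, overage_cost) for d in demands
    )
    return total / len(demands)


def best_order_from_samples(demands, underage_cost, overage_cost):
    """Pick the sample value that minimizes the empirical loss.

    The empirical loss is piecewise linear and convex in the order quantity,
    so a minimizer can always be found among the observed demand values.
    """
    return min(
        sorted(set(demands)),
        key=lambda q: mean_newsvendor_loss(q, demands, underage_cost, overage_cost),
    )


demands = [10, 20, 30, 40, 50]
# Illustrative costs: a lost sale costs 3, a leftover unit costs 1
best = best_order_from_samples(demands, underage_cost=3.0, overage_cost=1.0)
```

With a shortage three times as costly as a surplus, the minimizer sits at the 0.75 quantile of the samples (40 here) rather than at the mean (30). This asymmetry is what the Newsvendor loss captures.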

Higher-order clustering in networks


Title	Higher-order clustering in networks
Authors	Hao Yin, Austin R. Benson, Jure Leskovec
Abstract	A fundamental property of complex networks is the tendency for edges to cluster. The extent of the clustering is typically quantified by the clustering coefficient, which is the probability that a length-2 path is closed, i.e., induces a triangle in the network. However, higher-order cliques beyond triangles are crucial to understanding complex networks, and the clustering behavior with respect to such higher-order network structures is not well understood. Here we introduce higher-order clustering coefficients that measure the closure probability of higher-order network cliques and provide a more comprehensive view of how the edges of complex networks cluster. Our higher-order clustering coefficients are a natural generalization of the traditional clustering coefficient. We derive several properties about higher-order clustering coefficients and analyze them under common random graph models. Finally, we use higher-order clustering coefficients to gain new insights into the structure of real-world networks from several domains.
Tasks
Published	2017-04-12
URL	http://arxiv.org/abs/1704.03913v2
PDF	http://arxiv.org/pdf/1704.03913v2.pdf
PWC	https://paperswithcode.com/paper/higher-order-clustering-in-networks
Repo
Framework
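
The traditional clustering coefficient described in the abstract above, the probability that a length-2 path is closed into a triangle, can be computed directly by enumerating such paths. The sketch below covers only this classical (triangle) case on an undirected simple graph; the higher-order coefficients introduced in the paper are not implemented, and the function name is an assumption.

```python
from itertools import combinations


def global_clustering_coefficient(edges):
    """Fraction of length-2 paths (wedges) whose endpoints are also adjacent."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    wedges = 0
    closed = 0
    for neighbors in adjacency.values():
        # Every unordered pair of neighbors forms one wedge centered here
        for a, b in combinations(neighbors, 2):
            wedges += 1
            if b in adjacency[a]:
                closed += 1
    return closed / wedges if wedges else 0.0


# A triangle 0-1-2 with one pendant node attached to node 2
coefficient = global_clustering_coefficient([(0, 1), (1, 2), (0, 2), (2, 3)])
```

In this example there are five wedges, three of which are closed by the single triangle, so the coefficient is 3/5.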