October 19, 2019

3247 words 16 mins read

Paper Group ANR 129


CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation. Medical code prediction with multi-view convolution and description-regularized label-dependent attention. Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning. SCPNet: Spatial-Channel Parallelism Network …

CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation

Title CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation
Authors Junsong Fan, Zhaoxiang Zhang, Tieniu Tan, Chunfeng Song, Jun Xiao
Abstract Weakly supervised semantic segmentation with only image-level labels saves substantial human effort in annotating pixel-level labels. Cutting-edge approaches rely on various innovative constraints and heuristic rules to generate the masks for every single image. Although great progress has been achieved by these methods, they treat each image independently and do not take into account the relationships across different images. In this paper, however, we argue that the cross-image relationship is vital for weakly supervised segmentation, because it connects related regions across images, where supplementary representations can be propagated to obtain more consistent and integral regions. To leverage this information, we propose an end-to-end cross-image affinity module, which exploits pixel-level cross-image relationships with only image-level labels. By means of this module, our approach achieves 64.3% and 65.3% mIoU on the Pascal VOC 2012 validation and test sets respectively, a new state-of-the-art result using only image-level labels for weakly supervised semantic segmentation, demonstrating the superiority of our approach.
Tasks Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2018-11-27
URL https://arxiv.org/abs/1811.10842v2
PDF https://arxiv.org/pdf/1811.10842v2.pdf
PWC https://paperswithcode.com/paper/cian-cross-image-affinity-net-for-weakly
Repo
Framework
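
Below is a minimal PyTorch sketch of the cross-image affinity idea described in the abstract: pixel features of a second image that shares a class label are propagated into the first image through a softmax-normalized pixel-to-pixel affinity matrix. This is an illustration of the concept, not the authors' code; the feature dimensions and the simple additive fusion are assumptions.

```python
import torch
import torch.nn.functional as F

def cross_image_affinity(feat_a, feat_b):
    """feat_a, feat_b: (C, H, W) feature maps of two images sharing a class label."""
    c, h, w = feat_a.shape
    a = feat_a.reshape(c, h * w)             # (C, N)
    b = feat_b.reshape(c, h * w)             # (C, N)
    affinity = a.t() @ b                      # (N, N) pixel-to-pixel similarity
    weights = F.softmax(affinity, dim=-1)     # normalize over pixels of image B
    propagated = (weights @ b.t()).t()        # (C, N) features pulled from image B
    return propagated.reshape(c, h, w)

# Example: fuse propagated features with the original ones before classification.
fa, fb = torch.randn(256, 32, 32), torch.randn(256, 32, 32)
fused = fa + cross_image_affinity(fa, fb)
```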

Medical code prediction with multi-view convolution and description-regularized label-dependent attention

Title Medical code prediction with multi-view convolution and description-regularized label-dependent attention
Authors Najmeh Sadoughi, Greg P. Finley, James Fone, Vignesh Murali, Maxim Korenevski, Slava Baryshnikov, Nico Axtmann, Mark Miller, David Suendermann-Oeft
Abstract A ubiquitous task in processing electronic medical data is the assignment of standardized codes representing diagnoses and/or procedures to free-text documents such as medical reports. This is a difficult natural language processing task that requires parsing long, heterogeneous documents and selecting a set of appropriate codes from tens of thousands of possibilities—many of which have very few positive training samples. We present a deep learning system that advances the state of the art for the MIMIC-III dataset, achieving a new best micro F1-measure of 55.85%, significantly outperforming the previous best result (Mullenbach et al. 2018). We achieve this through a number of enhancements, including two major novel contributions: multi-view convolutional channels, which effectively learn to adjust kernel sizes throughout the input; and attention regularization, mediated by natural-language code descriptions, which helps overcome sparsity for thousands of uncommon codes. These and other modifications are selected to address difficulties inherent to both automated coding specifically and deep learning generally. Finally, we investigate our accuracy results in detail to individually measure the impact of these contributions and point the way towards future algorithmic improvements.
Tasks Medical Code Prediction
Published 2018-11-05
URL http://arxiv.org/abs/1811.01468v1
PDF http://arxiv.org/pdf/1811.01468v1.pdf
PWC https://paperswithcode.com/paper/medical-code-prediction-with-multi-view
Repo
Framework
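
The sketch below illustrates the two contributions named in the abstract in simplified form: parallel convolutional "views" with different kernel sizes over the document, and one attention vector per label over the resulting feature maps. All sizes are hypothetical, and the description-based attention regularization is omitted; this is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewLabelAttention(nn.Module):
    """Parallel convolutions (multiple kernel sizes) plus per-label attention."""
    def __init__(self, emb_dim=100, n_filters=50, kernel_sizes=(3, 5, 9), n_labels=50):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k, padding=k // 2) for k in kernel_sizes]
        )
        feat_dim = n_filters * len(kernel_sizes)
        self.label_queries = nn.Parameter(torch.randn(n_labels, feat_dim))
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, x):                        # x: (batch, seq_len, emb_dim)
        h = x.transpose(1, 2)                     # (batch, emb_dim, seq_len)
        views = [F.relu(conv(h)) for conv in self.convs]
        h = torch.cat(views, dim=1)               # (batch, feat_dim, seq_len)
        att = torch.softmax(self.label_queries @ h, dim=-1)   # (batch, n_labels, seq_len)
        per_label = att @ h.transpose(1, 2)       # (batch, n_labels, feat_dim)
        return self.classifier(per_label).squeeze(-1)          # one logit per label

logits = MultiViewLabelAttention()(torch.randn(2, 400, 100))
```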

Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning

Title Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning
Authors Avinash Varadarajan, Pinal Bavishi, Paisan Raumviboonsuk, Peranut Chotcomwongse, Subhashini Venugopalan, Arunachalam Narayanaswamy, Jorge Cuadros, Kuniyoshi Kanai, George Bresnick, Mongkol Tadarati, Sukhum Silpa-archa, Jirawut Limwattanayingyong, Variya Nganthavee, Joe Ledsam, Pearse A Keane, Greg S Corrado, Lily Peng, Dale R Webster
Abstract Diabetic eye disease is one of the fastest growing causes of preventable blindness. With the advent of anti-VEGF (vascular endothelial growth factor) therapies, it has become increasingly important to detect center-involved diabetic macular edema (ci-DME). However, center-involved diabetic macular edema is diagnosed using optical coherence tomography (OCT), which is not generally available at screening sites because of cost and workflow constraints. Instead, screening programs rely on the detection of hard exudates in color fundus photographs as a proxy for DME, often resulting in high false positive or false negative calls. To improve the accuracy of DME screening, we trained a deep learning model to use color fundus photographs to predict ci-DME. Our model had an ROC-AUC of 0.89 (95% CI: 0.87-0.91), which corresponds to a sensitivity of 85% at a specificity of 80%. In comparison, three retinal specialists had similar sensitivities (82-85%), but only half the specificity (45-50%, p<0.001 for each comparison with model). The positive predictive value (PPV) of the model was 61% (95% CI: 56-66%), approximately double the 36-38% by the retinal specialists. In addition to predicting ci-DME, our model was able to detect the presence of intraretinal fluid with an AUC of 0.81 (95% CI: 0.81-0.86) and subretinal fluid with an AUC of 0.88 (95% CI: 0.85-0.91). The ability of deep learning algorithms to make clinically relevant predictions that generally require sophisticated 3D-imaging equipment from simple 2D images has broad relevance to many other applications in medical imaging.
Tasks
Published 2018-10-18
URL https://arxiv.org/abs/1810.10342v4
PDF https://arxiv.org/pdf/1810.10342v4.pdf
PWC https://paperswithcode.com/paper/predicting-optical-coherence-tomography
Repo
Framework

SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Person Re-Identification

Title SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Person Re-Identification
Authors Xing Fan, Hao Luo, Xuan Zhang, Lingxiao He, Chi Zhang, Wei Jiang
Abstract Holistic person re-identification (ReID) has received extensive study in the past few years and has achieved impressive progress. However, persons are often occluded by obstacles or other persons in practical scenarios, which makes partial person re-identification non-trivial. In this paper, we propose a spatial-channel parallelism network (SCPNet) in which each channel in the ReID feature pays attention to a given spatial part of the body. This spatial-channel correspondence supervises the network to learn discriminative features for both holistic and partial person re-identification. A single model trained on four holistic ReID datasets achieves competitive accuracy on these four datasets, and also outperforms the state-of-the-art methods on two partial ReID datasets without training on them.
Tasks Person Re-Identification
Published 2018-10-16
URL http://arxiv.org/abs/1810.06996v1
PDF http://arxiv.org/pdf/1810.06996v1.pdf
PWC https://paperswithcode.com/paper/scpnet-spatial-channel-parallelism-network
Repo
Framework
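
A minimal sketch of the spatial-channel parallelism idea, assuming the simplest possible layout: the channel dimension is split into groups and each group is pooled over its own horizontal stripe of the feature map, so each channel group is tied to one body part. This is an illustration, not the SCPNet architecture itself.

```python
import torch

def spatial_channel_parallel_pool(feat, n_parts=4):
    """feat: (batch, C, H, W) with C divisible by n_parts."""
    b, c, h, w = feat.shape
    group = c // n_parts
    stripes = feat.chunk(n_parts, dim=2)              # split height into n_parts stripes
    pooled = [
        stripes[i][:, i * group:(i + 1) * group].mean(dim=(2, 3))  # i-th channel group over i-th stripe
        for i in range(n_parts)
    ]
    return torch.cat(pooled, dim=1)                   # (batch, C) part-aware descriptor

desc = spatial_channel_parallel_pool(torch.randn(2, 512, 24, 8))
```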

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

Title Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding
Authors Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia
Abstract Learning to estimate 3D geometry in a single image by watching unlabeled videos via deep convolutional networks has made significant progress recently. Current state-of-the-art (SOTA) methods are based on the learning framework of rigid structure-from-motion, where only 3D camera ego-motion is modeled for geometry estimation. However, moving objects also exist in many videos, e.g. moving cars in a street scene. In this paper, we tackle such motion by additionally incorporating per-pixel 3D object motion into the learning framework, which provides holistic 3D scene flow understanding and helps single-image geometry estimation. Specifically, given two consecutive frames from a video, we adopt a motion network to predict their relative 3D camera pose and a segmentation mask distinguishing moving objects from the rigid background. An optical flow network is used to estimate dense 2D per-pixel correspondence. A single-image depth network predicts depth maps for both images. The four types of information, i.e. 2D flow, camera pose, segmentation mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for the rigid background and moving objects is recovered. We design various losses w.r.t. the two types of 3D motion for training the depth and motion networks, yielding further error reduction for the estimated geometry. Finally, in order to resolve the 3D motion confusion from monocular videos, we combine stereo images into joint training. Experiments on the KITTI 2015 dataset show that our estimated geometry, 3D motion and moving-object masks are not only constrained to be consistent but also significantly outperform other SOTA algorithms, demonstrating the benefits of our approach.
Tasks Depth And Camera Motion, Depth Estimation, Optical Flow Estimation
Published 2018-06-27
URL http://arxiv.org/abs/1806.10556v2
PDF http://arxiv.org/pdf/1806.10556v2.pdf
PWC https://paperswithcode.com/paper/every-pixel-counts-unsupervised-geometry
Repo
Framework
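
The holistic 3D motion parser combines depth, camera pose, optical flow and a moving-object mask. The NumPy sketch below covers only the rigid-background part of that computation: back-projecting pixels with the depth map and camera intrinsics, applying the estimated ego-motion, and reading off the induced per-pixel 3D motion. Variable names and the pose format are assumptions, not the authors' code.

```python
import numpy as np

def rigid_background_motion(depth, K, R, t):
    """depth: (H, W) depth of frame 1; K: 3x3 intrinsics; (R, t): relative camera pose."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # (3, N) homogeneous pixels
    pts = np.linalg.inv(K) @ pix * depth.reshape(-1)        # back-project to 3D camera coordinates
    pts_cam2 = R @ pts + t.reshape(3, 1)                     # apply camera ego-motion
    return (pts_cam2 - pts).T.reshape(h, w, 3)               # per-pixel rigid 3D motion

# Pure translation example: every background pixel moves by (0.1, 0, 0).
motion = rigid_background_motion(np.ones((4, 5)), np.eye(3), np.eye(3), np.array([0.1, 0.0, 0.0]))
```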

Context Exploitation using Hierarchical Bayesian Models

Title Context Exploitation using Hierarchical Bayesian Models
Authors Christopher A. George, Pranab Banerjee, Kendra E. Moore
Abstract We consider the problem of how to improve automatic target recognition by fusing the naive sensor-level classification decisions with “intuition,” or context, in a mathematically principled way. This is a general approach that is compatible with many definitions of context, but for specificity, we consider context as co-occurrence in imagery. In particular, we consider images that contain multiple objects identified at various confidence levels. We learn the patterns of co-occurrence in each context, then use these patterns as hyper-parameters for a Hierarchical Bayesian Model. The result is that low-confidence sensor classification decisions can be dramatically improved by fusing those readings with context. We further use hyperpriors to address the case where multiple contexts may be appropriate. We also consider the Bayesian Network, an alternative to the Hierarchical Bayesian Model, which is computationally more efficient but assumes that context and sensor readings are uncorrelated.
Tasks
Published 2018-05-30
URL http://arxiv.org/abs/1805.12183v1
PDF http://arxiv.org/pdf/1805.12183v1.pdf
PWC https://paperswithcode.com/paper/context-exploitation-using-hierarchical
Repo
Framework
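
A minimal sketch of the fusion step the abstract describes, under simplifying assumptions: learned co-occurrence counts act as Dirichlet-style pseudo-counts, the other objects observed in the image define a context prior, and that prior is combined with the sensor's class probabilities. This is a toy illustration of context fusion, not the paper's hierarchical model.

```python
import numpy as np

def fuse_with_context(sensor_probs, cooccurrence_counts, observed_labels):
    """sensor_probs: (n_classes,) naive classifier output; counts: (n_classes, n_classes)."""
    alpha = cooccurrence_counts[observed_labels].sum(axis=0) + 1.0  # pseudo-counts from context
    context_prior = alpha / alpha.sum()
    posterior = sensor_probs * context_prior        # Bayes-style fusion (independence assumed)
    return posterior / posterior.sum()

# Toy example: the sensor is unsure, but class 0 frequently co-occurs with observed class 2.
counts = np.array([[5, 1, 9], [1, 5, 1], [9, 1, 5]], dtype=float)
print(fuse_with_context(np.array([0.4, 0.4, 0.2]), counts, observed_labels=[2]))
```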

On Cognitive Preferences and the Plausibility of Rule-based Models

Title On Cognitive Preferences and the Plausibility of Rule-based Models
Authors Johannes Fürnkranz, Tomáš Kliegr, Heiko Paulheim
Abstract It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likelihood that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, and the recognition heuristic, and investigate their relation to rule length and plausibility.
Tasks
Published 2018-03-04
URL http://arxiv.org/abs/1803.01316v4
PDF http://arxiv.org/pdf/1803.01316v4.pdf
PWC https://paperswithcode.com/paper/on-cognitive-preferences-and-the-plausibility
Repo
Framework

Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling

Title Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling
Authors Feng Liu, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, Yuzhou Zhang
Abstract Recommendation is crucial in both academia and industry, and various techniques have been proposed, such as content-based collaborative filtering, matrix factorization, logistic regression, factorization machines, neural networks and multi-armed bandits. However, most of the previous studies suffer from two limitations: (1) they consider recommendation as a static procedure and ignore the dynamic interactive nature between users and the recommender system, and (2) they focus on the immediate feedback of recommended items and neglect long-term rewards. To address these two limitations, in this paper we propose a novel recommendation framework based on deep reinforcement learning, called DRR. The DRR framework treats recommendation as a sequential decision-making procedure and adopts an “Actor-Critic” reinforcement learning scheme to model the interactions between the users and the recommender system, which can account for both dynamic adaptation and long-term rewards. Furthermore, a state representation module is incorporated into DRR, which can explicitly capture the interactions between items and users. Three instantiation structures are developed. Extensive experiments on four real-world datasets are conducted under both offline and online evaluation settings. The experimental results demonstrate that the proposed DRR method indeed outperforms state-of-the-art competitors.
Tasks Decision Making, Multi-Armed Bandits, Recommendation Systems
Published 2018-10-29
URL https://arxiv.org/abs/1810.12027v3
PDF https://arxiv.org/pdf/1810.12027v3.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-based
Repo
Framework
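
A minimal sketch of an explicit user-item state representation of the kind the abstract mentions, under assumptions: the state concatenates the user embedding, averaged element-wise user-item interaction terms, and the mean of the most recent item embeddings, and an actor network maps that state to an action vector. The dimensions and the particular instantiation are hypothetical, not necessarily one of the paper's three structures.

```python
import torch
import torch.nn as nn

def drr_state(user_emb, item_embs):
    """user_emb: (d,); item_embs: (k, d) embeddings of the last k interacted items."""
    interactions = user_emb * item_embs            # explicit user-item products, (k, d)
    return torch.cat([user_emb, interactions.mean(0), item_embs.mean(0)])   # (3d,)

# Toy actor producing an action/ranking vector from the state (sizes are assumptions).
state = drr_state(torch.randn(32), torch.randn(5, 32))
actor = nn.Sequential(nn.Linear(96, 64), nn.ReLU(), nn.Linear(64, 32), nn.Tanh())
action = actor(state)                              # scored against candidate item embeddings downstream
```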

Feature Assisted bi-directional LSTM Model for Protein-Protein Interaction Identification from Biomedical Texts

Title Feature Assisted bi-directional LSTM Model for Protein-Protein Interaction Identification from Biomedical Texts
Authors Shweta Yadav, Ankit Kumar, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya
Abstract Knowledge about protein-protein interactions is essential to understanding biological processes such as metabolic pathways, DNA replication, and transcription. However, a majority of the existing Protein-Protein Interaction (PPI) systems depend primarily on the scientific literature, which is not yet available as a structured database. Thus, efficient information extraction systems are required for identifying PPI information in large collections of biomedical texts. Most of the existing systems model the PPI extraction task as a classification problem and are tailored to handcrafted feature sets, including domain-dependent features. In this paper, we present a novel method based on a deep bidirectional long short-term memory (B-LSTM) technique that exploits word sequences and dependency-path-related information to identify PPI information from text. This model leverages joint modeling of proteins and relations in a single unified framework, which we name the Shortest Dependency Path B-LSTM (sdpLSTM) model. We perform experiments on two popular benchmark PPI datasets, namely AiMed & BioInfer. The evaluation shows F1-scores of 86.45% and 77.35% on AiMed and BioInfer, respectively. Comparisons with existing systems show that our proposed approach attains state-of-the-art performance.
Tasks
Published 2018-07-05
URL http://arxiv.org/abs/1807.02162v1
PDF http://arxiv.org/pdf/1807.02162v1.pdf
PWC https://paperswithcode.com/paper/feature-assisted-bi-directional-lstm-model
Repo
Framework
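
A minimal sketch of the core sdpLSTM idea: a bidirectional LSTM runs over the tokens on the shortest dependency path between the two protein mentions, and a pooled representation feeds a binary interaction classifier. Vocabulary size, hidden sizes and pooling are assumptions, and the additional handcrafted features mentioned in the title are omitted.

```python
import torch
import torch.nn as nn

class SDPBiLSTM(nn.Module):
    """BiLSTM over shortest-dependency-path tokens -> interacts / does-not-interact."""
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)

    def forward(self, sdp_tokens):                 # (batch, path_len) token ids
        h, _ = self.lstm(self.emb(sdp_tokens))
        return self.out(h.mean(dim=1))             # pooled path representation

logits = SDPBiLSTM()(torch.randint(0, 10000, (4, 7)))
```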

Reasoning in a Hierarchical System with Missing Group Size Information

Title Reasoning in a Hierarchical System with Missing Group Size Information
Authors Subhash Kak
Abstract The paper analyzes the problem of judgments or preferences subsequent to initial analysis by autonomous agents in a hierarchical system where the higher-level agent does not have access to group size information. We propose methods that reduce instances of preference reversal of the kind encountered in Simpson’s paradox.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.04093v1
PDF http://arxiv.org/pdf/1802.04093v1.pdf
PWC https://paperswithcode.com/paper/reasoning-in-a-hierarchical-system-with
Repo
Framework
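
A toy numerical illustration of the preference reversal the paper targets, using classic Simpson's-paradox figures rather than data from the paper: option A has the higher success rate within every group, yet option B looks better once group sizes are dropped and the data are pooled.

```python
# (successes, trials) per group and option; numbers are illustrative only.
groups = {
    "small": {"A": (81, 87),   "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}
for name, g in groups.items():
    print(name, {k: round(s / n, 3) for k, (s, n) in g.items()})   # A leads in both groups
pooled = {k: sum(g[k][0] for g in groups.values()) / sum(g[k][1] for g in groups.values())
          for k in ("A", "B")}
print("pooled", {k: round(v, 3) for k, v in pooled.items()})        # B leads when pooled
```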

Attention-based Neural Text Segmentation

Title Attention-based Neural Text Segmentation
Authors Pinkesh Badjatiya, Litton J Kurisinkel, Manish Gupta, Vasudeva Varma
Abstract Text segmentation plays an important role in various Natural Language Processing (NLP) tasks like summarization, context understanding, document indexing and document noise removal. Previous methods for this task require manual feature engineering and suffer from huge memory requirements and long execution times. To the best of our knowledge, this paper is the first to present a supervised neural approach for text segmentation. Specifically, we propose an attention-based bidirectional LSTM model where sentence embeddings are learned using CNNs and the segments are predicted based on contextual information. This model can automatically handle variable-sized context information. Compared to the existing competitive baselines, the proposed model shows a performance improvement of ~7% in WinDiff score on three benchmark datasets.
Tasks Feature Engineering, Sentence Embeddings
Published 2018-08-29
URL http://arxiv.org/abs/1808.09935v1
PDF http://arxiv.org/pdf/1808.09935v1.pdf
PWC https://paperswithcode.com/paper/attention-based-neural-text-segmentation
Repo
Framework
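
The sketch below shows the general shape of such a model: a CNN builds a sentence embedding from word vectors, a bidirectional LSTM contextualizes the sentence sequence, and an attention-weighted document context feeds a per-sentence boundary classifier. All sizes and the exact attention form are assumptions; this is not the paper's architecture verbatim.

```python
import torch
import torch.nn as nn

class SegmentBoundaryTagger(nn.Module):
    """CNN sentence embeddings -> BiLSTM -> attention -> boundary / non-boundary per sentence."""
    def __init__(self, emb_dim=300, sent_dim=128, hidden=128):
        super().__init__()
        self.sent_cnn = nn.Conv1d(emb_dim, sent_dim, kernel_size=3, padding=1)
        self.ctx = nn.LSTM(sent_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.out = nn.Linear(2 * hidden, 2)

    def forward(self, words):                  # (batch, n_sents, n_words, emb_dim)
        b, s, w, d = words.shape
        sents = self.sent_cnn(words.reshape(b * s, w, d).transpose(1, 2)).max(dim=2).values
        h, _ = self.ctx(sents.reshape(b, s, -1))
        weights = torch.softmax(self.attn(h), dim=1)
        context = (weights * h).sum(dim=1, keepdim=True)   # attention-pooled document context
        return self.out(h + context)                       # per-sentence boundary logits

logits = SegmentBoundaryTagger()(torch.randn(2, 10, 20, 300))
```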

Neural Networks for Cross-lingual Negation Scope Detection

Title Neural Networks for Cross-lingual Negation Scope Detection
Authors Federico Fancellu, Adam Lopez, Bonnie Webber
Abstract Negation scope has been annotated in several English and Chinese corpora, and highly accurate models for this task in these languages have been learned from these annotations. Unfortunately, annotations are not available in other languages. Could a model that detects negation scope be applied to a language that it hasn’t been trained on? We develop neural models that learn from cross-lingual word embeddings or universal dependencies in English, and test them on Chinese, showing that they work surprisingly well. We find that modelling syntax is helpful even in monolingual settings and that cross-lingual word embeddings help relatively little, and we analyse cases that are still difficult for this task.
Tasks Word Embeddings
Published 2018-10-04
URL http://arxiv.org/abs/1810.02156v1
PDF http://arxiv.org/pdf/1810.02156v1.pdf
PWC https://paperswithcode.com/paper/neural-networks-for-cross-lingual-negation
Repo
Framework
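
A minimal sketch of why the cross-lingual transfer described above is possible: the tagger reads pre-aligned cross-lingual word embeddings, so weights trained on English token vectors can be applied unchanged to Chinese token vectors living in the same space. Dimensions are assumptions, and the universal-dependency variant is omitted.

```python
import torch
import torch.nn as nn

class NegationScopeTagger(nn.Module):
    """BiLSTM tagger over aligned cross-lingual embeddings; per-token in-scope / out-of-scope."""
    def __init__(self, emb_dim=300, hidden=100):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)

    def forward(self, aligned_embs):           # (batch, seq_len, emb_dim)
        h, _ = self.lstm(aligned_embs)
        return self.out(h)

tagger = NegationScopeTagger()
en_logits = tagger(torch.randn(1, 12, 300))    # trained on English vectors...
zh_logits = tagger(torch.randn(1, 15, 300))    # ...applied unchanged to Chinese vectors
```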

A Generalized Meta-loss function for regression and classification using privileged information

Title A Generalized Meta-loss function for regression and classification using privileged information
Authors Amina Asif, Muhammad Dawood, Fayyaz ul Amir Afsar Minhas
Abstract Learning using privileged information (LUPI) is a powerful heterogeneous-feature-space machine learning framework that allows a machine learning model to learn from highly informative or privileged features, which are available only during training, in order to generate test predictions using input-space features that are available during both training and testing. LUPI can significantly improve prediction performance in a variety of machine learning problems. However, existing large-margin and neural network implementations of learning using privileged information are mostly designed for classification tasks. In this work, we propose a simple yet effective formulation that allows us to perform regression using privileged information through a custom loss function. Apart from regression, our formulation allows general application of LUPI to classification and other related problems as well. We have verified the correctness, applicability and effectiveness of our method on regression and classification problems over different synthetic and real-world problems. To test the usefulness of the proposed model in real-world problems, we evaluated our method on the problem of protein binding affinity prediction. The proposed LUPI regression-based model is shown to outperform the current state-of-the-art predictor.
Tasks
Published 2018-11-16
URL http://arxiv.org/abs/1811.06885v2
PDF http://arxiv.org/pdf/1811.06885v2.pdf
PWC https://paperswithcode.com/paper/a-generalized-meta-loss-function-for
Repo
Framework
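
A minimal sketch of a LUPI-style regression loss in the generalized-distillation flavour, which is one common way to realize the idea the abstract describes; it is not necessarily the paper's exact meta-loss. The usual label loss is mixed with a term that pulls the input-space model towards the predictions of a model trained on the privileged features.

```python
import torch
import torch.nn.functional as F

def lupi_regression_loss(pred, target, privileged_pred, lam=0.5):
    """pred: input-space model output; privileged_pred: output of a model using privileged features."""
    data_term = F.mse_loss(pred, target)                    # fit the true labels
    privileged_term = F.mse_loss(pred, privileged_pred)     # imitate the privileged-feature model
    return (1.0 - lam) * data_term + lam * privileged_term

loss = lupi_regression_loss(torch.randn(8), torch.randn(8), torch.randn(8))
```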

Improving Automatic Skin Lesion Segmentation using Adversarial Learning based Data Augmentation

Title Improving Automatic Skin Lesion Segmentation using Adversarial Learning based Data Augmentation
Authors Lei Bi, Dagan Feng, Jinman Kim
Abstract Segmentation of skin lesions is considered an important step in computer-aided diagnosis (CAD) for automated melanoma diagnosis. In recent years, segmentation methods based on fully convolutional networks (FCN) have achieved great success on general images. This success is primarily due to leveraging large labelled datasets to learn features that correspond to the shallow appearance as well as the deep semantics of the images. However, the dependence on large datasets does not translate well to medical images. To improve FCN performance for skin lesion segmentation, researchers have attempted to use specific cost functions or to add post-processing algorithms to refine the coarse boundaries of the FCN results. However, the performance of these methods is heavily reliant on the tuning of many parameters and post-processing techniques. In this paper, we leverage the state-of-the-art image feature learning method of the generative adversarial network (GAN) for its inherent ability to produce consistent and realistic image features by using deep neural networks and the adversarial learning concept. We improve upon the GAN such that skin lesion features can be learned at different levels of complexity, in a controlled manner. The outputs of our method are then added to the existing FCN training data, thus increasing the overall feature diversity. We evaluated our method on the ISIC 2018 skin lesion segmentation challenge dataset and showed that it was more accurate and robust than existing skin lesion segmentation methods.
Tasks Data Augmentation, Lesion Segmentation
Published 2018-07-23
URL http://arxiv.org/abs/1807.08392v2
PDF http://arxiv.org/pdf/1807.08392v2.pdf
PWC https://paperswithcode.com/paper/improving-automatic-skin-lesion-segmentation
Repo
Framework
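
A minimal sketch of the augmentation step described in the abstract, assuming a hypothetical conditional generator interface that returns paired lesion images and masks: synthetic pairs are sampled and appended to the FCN training set. The generator itself and its training are out of scope here.

```python
import torch

def augment_with_gan(real_images, real_masks, generator, n_synthetic=100, z_dim=128):
    """Append generator-sampled image/mask pairs to the real training data."""
    z = torch.randn(n_synthetic, z_dim)
    with torch.no_grad():
        fake_images, fake_masks = generator(z)       # assumed: generator returns paired outputs
    images = torch.cat([real_images, fake_images], dim=0)
    masks = torch.cat([real_masks, fake_masks], dim=0)
    return images, masks

# Toy stand-in generator for demonstration only.
dummy_gen = lambda z: (torch.rand(z.shape[0], 3, 64, 64), torch.rand(z.shape[0], 1, 64, 64))
imgs, msks = augment_with_gan(torch.rand(10, 3, 64, 64), torch.rand(10, 1, 64, 64), dummy_gen)
```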

New approach for solar tracking systems based on computer vision, low cost hardware and deep learning

Title New approach for solar tracking systems based on computer vision, low cost hardware and deep learning
Authors Jose A. Carballo, Javier Bonilla, Manuel Berenguel, Jesús Fernández-Reche, Ginés García
Abstract In this work, a new approach for Sun tracking systems is presented. Due to the cost and operational limitations of current systems, a new approach based on low-cost open hardware, computer vision and deep learning has been developed. Preliminary tests carried out successfully at the Plataforma Solar de Almeria (PSA) reveal its great potential and show the new approach to be a good alternative to traditional systems. The proposed approach can provide key variables for Sun tracking system control, such as cloud movement prediction, block and shadow detection, atmospheric attenuation and measures of concentrated solar radiation, which can improve the control strategies of the system and therefore the system performance.
Tasks Shadow Detection
Published 2018-09-19
URL http://arxiv.org/abs/1809.07048v1
PDF http://arxiv.org/pdf/1809.07048v1.pdf
PWC https://paperswithcode.com/paper/new-approach-for-solar-tracking-systems-based
Repo
Framework