Paper Group ANR 129
CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation
Title | CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation |
Authors | Junsong Fan, Zhaoxiang Zhang, Tieniu Tan, Chunfeng Song, Jun Xiao |
Abstract | Weakly supervised semantic segmentation with only image-level labels saves the considerable human effort of annotating pixel-level labels. Cutting-edge approaches rely on various innovative constraints and heuristic rules to generate the masks for every single image. Although these methods have achieved great progress, they treat each image independently and do not take into account the relationships across different images. In this paper, however, we argue that the cross-image relationship is vital for weakly supervised segmentation, because it connects related regions across images, where supplementary representations can be propagated to obtain more consistent and integral regions. To leverage this information, we propose an end-to-end cross-image affinity module, which exploits pixel-level cross-image relationships with only image-level labels. With this module, our approach achieves 64.3% and 65.3% mIoU on the Pascal VOC 2012 validation and test sets respectively, which is a new state-of-the-art result for weakly supervised semantic segmentation using only image-level labels, demonstrating the superiority of our approach. |
Tasks | Semantic Segmentation, Weakly-Supervised Semantic Segmentation |
Published | 2018-11-27 |
URL | https://arxiv.org/abs/1811.10842v2 |
https://arxiv.org/pdf/1811.10842v2.pdf | |
PWC | https://paperswithcode.com/paper/cian-cross-image-affinity-net-for-weakly |
Repo | |
Framework | |
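The mIoU figure quoted in the abstract above is a mean of per-class intersection-over-union scores. The following is a minimal sketch of that metric on flat label lists, not the authors' evaluation code; the function name `mean_iou` and the toy labels are illustrative assumptions, and benchmark evaluations typically accumulate the counts over a whole dataset rather than a single list.

```python
from typing import Sequence


def mean_iou(gold: Sequence[int], pred: Sequence[int], num_classes: int) -> float:
    """Mean of per-class intersection-over-union over flat label sequences.

    Classes that appear in neither gold nor pred are skipped, so they do not
    drag the mean down (an assumption of this sketch).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both gold and prediction.
        inter = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        # Pixels labelled c in either gold or prediction.
        union = sum(1 for g, p in zip(gold, pred) if g == c or p == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy example: six pixels, three classes.
gold_pixels = [0, 0, 1, 1, 2, 2]
pred_pixels = [0, 1, 1, 1, 2, 0]
print(mean_iou(gold_pixels, pred_pixels, num_classes=3))
```

In the toy example the per-class scores are 1/3, 2/3 and 1/2, so the mean is 0.5.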
Medical code prediction with multi-view convolution and description-regularized label-dependent attention
Title | Medical code prediction with multi-view convolution and description-regularized label-dependent attention |
Authors | Najmeh Sadoughi, Greg P. Finley, James Fone, Vignesh Murali, Maxim Korenevski, Slava Baryshnikov, Nico Axtmann, Mark Miller, David Suendermann-Oeft |
Abstract | A ubiquitous task in processing electronic medical data is the assignment of standardized codes representing diagnoses and/or procedures to free-text documents such as medical reports. This is a difficult natural language processing task that requires parsing long, heterogeneous documents and selecting a set of appropriate codes from tens of thousands of possibilities—many of which have very few positive training samples. We present a deep learning system that advances the state of the art for the MIMIC-III dataset, achieving a new best micro F1-measure of 55.85%, significantly outperforming the previous best result (Mullenbach et al. 2018). We achieve this through a number of enhancements, including two major novel contributions: multi-view convolutional channels, which effectively learn to adjust kernel sizes throughout the input; and attention regularization, mediated by natural-language code descriptions, which helps overcome sparsity for thousands of uncommon codes. These and other modifications are selected to address difficulties inherent to both automated coding specifically and deep learning generally. Finally, we investigate our accuracy results in detail to individually measure the impact of these contributions and point the way towards future algorithmic improvements. |
Tasks | Medical Code Prediction |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01468v1 |
http://arxiv.org/pdf/1811.01468v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-code-prediction-with-multi-view |
Repo | |
Framework | |
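The micro F1-measure reported in the abstract above pools true positives, false positives and false negatives over all documents and codes before computing F1, so frequent codes dominate the score. The sketch below shows that calculation under stated assumptions: the function name `micro_f1` and the code strings are hypothetical placeholders, not data or code from the paper.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], pred: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label prediction (one code set per document)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # codes predicted and correct
        fp += len(p - g)   # codes predicted but not in the gold set
        fn += len(g - p)   # gold codes that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: two documents with illustrative code labels.
gold_codes = [{"code_a", "code_b"}, {"code_c"}]
pred_codes = [{"code_a"}, {"code_c", "code_d"}]
print(micro_f1(gold_codes, pred_codes))
```

Here there are two true positives, one false positive and one false negative, giving a micro F1 of 2/3.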
Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning
Title | Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning |
Authors | Avinash Varadarajan, Pinal Bavishi, Paisan Raumviboonsuk, Peranut Chotcomwongse, Subhashini Venugopalan, Arunachalam Narayanaswamy, Jorge Cuadros, Kuniyoshi Kanai, George Bresnick, Mongkol Tadarati, Sukhum Silpa-archa, Jirawut Limwattanayingyong, Variya Nganthavee, Joe Ledsam, Pearse A Keane, Greg S Corrado, Lily Peng, Dale R Webster |
Abstract | Diabetic eye disease is one of the fastest growing causes of preventable blindness. With the advent of anti-VEGF (vascular endothelial growth factor) therapies, it has become increasingly important to detect center-involved diabetic macular edema (ci-DME). However, center-involved diabetic macular edema is diagnosed using optical coherence tomography (OCT), which is not generally available at screening sites because of cost and workflow constraints. Instead, screening programs rely on the detection of hard exudates in color fundus photographs as a proxy for DME, often resulting in high false positive or false negative calls. To improve the accuracy of DME screening, we trained a deep learning model to use color fundus photographs to predict ci-DME. Our model had an ROC-AUC of 0.89 (95% CI: 0.87-0.91), which corresponds to a sensitivity of 85% at a specificity of 80%. In comparison, three retinal specialists had similar sensitivities (82-85%), but only half the specificity (45-50%, p<0.001 for each comparison with model). The positive predictive value (PPV) of the model was 61% (95% CI: 56-66%), approximately double the 36-38% by the retinal specialists. In addition to predicting ci-DME, our model was able to detect the presence of intraretinal fluid with an AUC of 0.81 (95% CI: 0.81-0.86) and subretinal fluid with an AUC of 0.88 (95% CI: 0.85-0.91). The ability of deep learning algorithms to make clinically relevant predictions that generally require sophisticated 3D-imaging equipment from simple 2D images has broad relevance to many other applications in medical imaging. |
Tasks | |
Published | 2018-10-18 |
URL | https://arxiv.org/abs/1810.10342v4 |
https://arxiv.org/pdf/1810.10342v4.pdf | |
PWC | https://paperswithcode.com/paper/predicting-optical-coherence-tomography |
Repo | |
Framework | |
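The abstract above reports sensitivity, specificity and positive predictive value (PPV). PPV depends on disease prevalence in the evaluated population as well as on the operating point, which the following sketch illustrates with Bayes' rule. The sensitivity and specificity are taken from the abstract; the prevalence values are assumptions chosen purely for illustration and are not figures from the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total_pos = true_pos + false_pos
    # No positive calls at all (e.g. prevalence 0 with perfect specificity).
    return true_pos / total_pos if total_pos else 0.0


# Operating point from the abstract: 85% sensitivity at 80% specificity.
# The prevalence values below are hypothetical.
for assumed_prevalence in (0.05, 0.10, 0.27, 0.50):
    value = ppv(0.85, 0.80, assumed_prevalence)
    print(f"prevalence={assumed_prevalence:.2f} -> PPV={value:.3f}")
```

The same classifier yields a much lower PPV in a low-prevalence screening population than in an enriched one, which is why PPV comparisons are only meaningful on the same cohort.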
SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Person Re-Identification
Title | SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Person Re-Identification |
Authors | Xing Fan, Hao Luo, Xuan Zhang, Lingxiao He, Chi Zhang, Wei Jiang |
Abstract | Holistic person re-identification (ReID) has been studied extensively in the past few years and has made impressive progress. However, persons are often occluded by obstacles or other persons in practical scenarios, which makes partial person re-identification non-trivial. In this paper, we propose a spatial-channel parallelism network (SCPNet) in which each channel in the ReID feature pays attention to a given spatial part of the body. The spatial-channel correspondence supervises the network to learn discriminative features for both holistic and partial person re-identification. A single model trained on four holistic ReID datasets achieves competitive accuracy on these four datasets and outperforms the state-of-the-art methods on two partial ReID datasets without training on them. |
Tasks | Person Re-Identification |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.06996v1 |
http://arxiv.org/pdf/1810.06996v1.pdf | |
PWC | https://paperswithcode.com/paper/scpnet-spatial-channel-parallelism-network |
Repo | |
Framework | |
Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding
Title | Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding |
Authors | Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia |
Abstract | Learning to estimate 3D geometry in a single image by watching unlabeled videos via a deep convolutional network has made significant progress recently. Current state-of-the-art (SOTA) methods are based on the learning framework of rigid structure-from-motion, where only 3D camera ego motion is modeled for geometry estimation. However, moving objects also exist in many videos, e.g. moving cars in a street scene. In this paper, we tackle such motion by additionally incorporating per-pixel 3D object motion into the learning framework, which provides holistic 3D scene flow understanding and helps single image geometry estimation. Specifically, given two consecutive frames from a video, we adopt a motion network to predict their relative 3D camera pose and a segmentation mask distinguishing moving objects and rigid background. An optical flow network is used to estimate dense 2D per-pixel correspondence. A single image depth network predicts depth maps for both images. The four types of information, i.e. 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects is recovered. We design various losses w.r.t. the two types of 3D motions for training the depth and motion networks, yielding further error reduction for estimated geometry. Finally, in order to resolve the 3D motion confusion from monocular videos, we combine stereo images into joint training. Experiments on the KITTI 2015 dataset show that our estimated geometry, 3D motion and moving object masks are not only constrained to be consistent, but also significantly outperform other SOTA algorithms, demonstrating the benefits of our approach. |
Tasks | Depth And Camera Motion, Depth Estimation, Optical Flow Estimation |
Published | 2018-06-27 |
URL | http://arxiv.org/abs/1806.10556v2 |
http://arxiv.org/pdf/1806.10556v2.pdf | |
PWC | https://paperswithcode.com/paper/every-pixel-counts-unsupervised-geometry |
Repo | |
Framework | |
Context Exploitation using Hierarchical Bayesian Models
Title | Context Exploitation using Hierarchical Bayesian Models |
Authors | Christopher A. George, Pranab Banerjee, Kendra E. Moore |
Abstract | We consider the problem of how to improve automatic target recognition by fusing the naive sensor-level classification decisions with “intuition,” or context, in a mathematically principled way. This is a general approach that is compatible with many definitions of context, but for specificity, we consider context as co-occurrence in imagery. In particular, we consider images that contain multiple objects identified at various confidence levels. We learn the patterns of co-occurrence in each context, then use these patterns as hyper-parameters for a Hierarchical Bayesian Model. The result is that low-confidence sensor classification decisions can be dramatically improved by fusing those readings with context. We further use hyperpriors to address the case where multiple contexts may be appropriate. We also consider the Bayesian Network, an alternative to the Hierarchical Bayesian Model, which is computationally more efficient but assumes that context and sensor readings are uncorrelated. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12183v1 |
http://arxiv.org/pdf/1805.12183v1.pdf | |
PWC | https://paperswithcode.com/paper/context-exploitation-using-hierarchical |
Repo | |
Framework | |
On Cognitive Preferences and the Plausibility of Rule-based Models
Title | On Cognitive Preferences and the Plausibility of Rule-based Models |
Authors | Johannes Fürnkranz, Tomáš Kliegr, Heiko Paulheim |
Abstract | It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likelihood that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, or the recognition heuristic, and investigate their relation to rule length and plausibility. |
Tasks | |
Published | 2018-03-04 |
URL | http://arxiv.org/abs/1803.01316v4 |
http://arxiv.org/pdf/1803.01316v4.pdf | |
PWC | https://paperswithcode.com/paper/on-cognitive-preferences-and-the-plausibility |
Repo | |
Framework | |
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling
Title | Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling |
Authors | Feng Liu, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, Yuzhou Zhang |
Abstract | Recommendation is crucial in both academia and industry, and various techniques are proposed such as content-based collaborative filtering, matrix factorization, logistic regression, factorization machines, neural networks and multi-armed bandits. However, most of the previous studies suffer from two limitations: (1) considering the recommendation as a static procedure and ignoring the dynamic interactive nature between users and the recommender systems, (2) focusing on the immediate feedback of recommended items and neglecting the long-term rewards. To address the two limitations, in this paper we propose a novel recommendation framework based on deep reinforcement learning, called DRR. The DRR framework treats recommendation as a sequential decision making procedure and adopts an “Actor-Critic” reinforcement learning scheme to model the interactions between the users and recommender systems, which can consider both the dynamic adaptation and long-term rewards. Furthermore, a state representation module is incorporated into DRR, which can explicitly capture the interactions between items and users. Three instantiation structures are developed. Extensive experiments on four real-world datasets are conducted under both the offline and online evaluation settings. The experimental results demonstrate the proposed DRR method indeed outperforms the state-of-the-art competitors. |
Tasks | Decision Making, Multi-Armed Bandits, Recommendation Systems |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.12027v3 |
https://arxiv.org/pdf/1810.12027v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-based |
Repo | |
Framework | |
Feature Assisted bi-directional LSTM Model for Protein-Protein Interaction Identification from Biomedical Texts
Title | Feature Assisted bi-directional LSTM Model for Protein-Protein Interaction Identification from Biomedical Texts |
Authors | Shweta Yadav, Ankit Kumar, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya |
Abstract | Knowledge about protein-protein interactions is essential for understanding biological processes such as metabolic pathways, DNA replication, and transcription. However, a majority of the existing Protein-Protein Interaction (PPI) systems depend primarily on the scientific literature, which is not yet accessible as a structured database. Thus, efficient information extraction systems are required for identifying PPI information from the large collection of biomedical texts. Most of the existing systems model the PPI extraction task as a classification problem and are tailored to a handcrafted feature set including domain-dependent features. In this paper, we present a novel method based on a deep bidirectional long short-term memory (B-LSTM) technique that exploits word sequences and dependency-path-related information to identify PPI information from text. This model leverages joint modeling of proteins and relations in a single unified framework, which we name the Shortest Dependency Path B-LSTM (sdpLSTM) model. We perform experiments on two popular benchmark PPI datasets, namely AiMed and BioInfer. The evaluation shows F1-score values of 86.45% and 77.35% on AiMed and BioInfer, respectively. Comparisons with the existing systems show that our proposed approach attains state-of-the-art performance. |
Tasks | |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02162v1 |
http://arxiv.org/pdf/1807.02162v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-assisted-bi-directional-lstm-model |
Repo | |
Framework | |
Reasoning in a Hierarchical System with Missing Group Size Information
Title | Reasoning in a Hierarchical System with Missing Group Size Information |
Authors | Subhash Kak |
Abstract | The paper analyzes the problem of judgments or preferences subsequent to initial analysis by autonomous agents in a hierarchical system where the higher-level agents do not have access to group size information. We propose methods that reduce instances of preference reversal of the kind encountered in Simpson’s paradox. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.04093v1 |
http://arxiv.org/pdf/1802.04093v1.pdf | |
PWC | https://paperswithcode.com/paper/reasoning-in-a-hierarchical-system-with |
Repo | |
Framework | |
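The preference reversal mentioned in the abstract above can be seen in a small numeric example. The sketch below does not implement the paper's methods; it only illustrates Simpson's paradox with illustrative counts, where option A has the higher success rate inside every group yet option B has the higher rate once the groups are pooled. An agent that sees only the per-group rates, without the group sizes, cannot compute the pooled comparison. The names `groups`, `rate` and `pooled` are assumptions of this sketch.

```python
from fractions import Fraction

# Illustrative counts: (successes, trials) per option within each group.
groups = {
    "group_1": {"A": (81, 87), "B": (234, 270)},
    "group_2": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes: int, trials: int) -> Fraction:
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def pooled(option: str) -> Fraction:
    """Success rate of an option after pooling all groups (needs group sizes)."""
    successes = sum(g[option][0] for g in groups.values())
    trials = sum(g[option][1] for g in groups.values())
    return rate(successes, trials)


for name, g in groups.items():
    print(name, "A:", round(float(rate(*g["A"])), 3), "B:", round(float(rate(*g["B"])), 3))
print("pooled", "A:", round(float(pooled("A")), 3), "B:", round(float(pooled("B")), 3))
```

The reversal arises because the two options are spread very unevenly across the groups, so the pooled rates are weighted by group sizes that the per-group rates alone do not reveal.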
Attention-based Neural Text Segmentation
Title | Attention-based Neural Text Segmentation |
Authors | Pinkesh Badjatiya, Litton J Kurisinkel, Manish Gupta, Vasudeva Varma |
Abstract | Text segmentation plays an important role in various Natural Language Processing (NLP) tasks like summarization, context understanding, document indexing and document noise removal. Previous methods for this task require manual feature engineering, large amounts of memory and long execution times. To the best of our knowledge, this paper is the first to present a supervised neural approach for text segmentation. Specifically, we propose an attention-based bidirectional LSTM model where sentence embeddings are learned using CNNs and the segments are predicted based on contextual information. This model can automatically handle variable-sized context information. Compared to the existing competitive baselines, the proposed model shows a performance improvement of ~7% in WinDiff score on three benchmark datasets. |
Tasks | Feature Engineering, Sentence Embeddings |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09935v1 |
http://arxiv.org/pdf/1808.09935v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-neural-text-segmentation |
Repo | |
Framework | |
Neural Networks for Cross-lingual Negation Scope Detection
Title | Neural Networks for Cross-lingual Negation Scope Detection |
Authors | Federico Fancellu, Adam Lopez, Bonnie Webber |
Abstract | Negation scope has been annotated in several English and Chinese corpora, and highly accurate models for this task in these languages have been learned from these annotations. Unfortunately, annotations are not available in other languages. Could a model that detects negation scope be applied to a language that it hasn’t been trained on? We develop neural models that learn from cross-lingual word embeddings or universal dependencies in English, and test them on Chinese, showing that they work surprisingly well. We find that modelling syntax is helpful even in monolingual settings and that cross-lingual word embeddings help relatively little, and we analyse cases that are still difficult for this task. |
Tasks | Word Embeddings |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02156v1 |
http://arxiv.org/pdf/1810.02156v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-for-cross-lingual-negation |
Repo | |
Framework | |
A Generalized Meta-loss function for regression and classification using privileged information
Title | A Generalized Meta-loss function for regression and classification using privileged information |
Authors | Amina Asif, Muhammad Dawood, Fayyaz ul Amir Afsar Minhas |
Abstract | Learning using privileged information (LUPI) is a powerful heterogeneous feature space machine learning framework that allows a machine learning model to learn from highly informative or privileged features, which are available only during training, in order to generate test predictions using input space features, which are available during both training and testing. LUPI can significantly improve prediction performance in a variety of machine learning problems. However, existing large margin and neural network implementations of learning using privileged information are mostly designed for classification tasks. In this work, we propose a simple yet effective formulation that allows us to perform regression using privileged information through a custom loss function. Apart from regression, our formulation allows general application of LUPI to classification and other related problems as well. We have verified the correctness, applicability and effectiveness of our method on regression and classification problems over different synthetic and real-world problems. To test the usefulness of the proposed model in real-world problems, we have evaluated our method on the problem of protein binding affinity prediction. The proposed LUPI regression-based model has been shown to outperform the current state-of-the-art predictor. |
Tasks | |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06885v2 |
http://arxiv.org/pdf/1811.06885v2.pdf | |
PWC | https://paperswithcode.com/paper/a-generalized-meta-loss-function-for |
Repo | |
Framework | |
Improving Automatic Skin Lesion Segmentation using Adversarial Learning based Data Augmentation
Title | Improving Automatic Skin Lesion Segmentation using Adversarial Learning based Data Augmentation |
Authors | Lei Bi, Dagan Feng, Jinman Kim |
Abstract | Segmentation of skin lesions is considered an important step in computer aided diagnosis (CAD) for automated melanoma diagnosis. In recent years, segmentation methods based on fully convolutional networks (FCN) have achieved great success in general images. This success is primarily due to the leveraging of large labelled datasets to learn features that correspond to the shallow appearance as well as the deep semantics of the images. However, the dependence on large datasets does not translate well to medical images. To improve the FCN performance for skin lesion segmentation, researchers have attempted to use specific cost functions or add post-processing algorithms to refine the coarse boundaries of the FCN results. However, the performance of these methods is heavily reliant on the tuning of many parameters and post-processing techniques. In this paper, we leverage the state-of-the-art image feature learning method of generative adversarial networks (GAN) for its inherent ability to produce consistent and realistic image features by using deep neural networks and the adversarial learning concept. We improve upon GAN such that skin lesion features can be learned at different levels of complexity, in a controlled manner. The outputs from our method are then added to the existing FCN training data, thus increasing the overall feature diversity. We evaluated our method on the ISIC 2018 skin lesion segmentation challenge dataset and showed that it was more accurate and robust when compared to the existing skin lesion segmentation methods. |
Tasks | Data Augmentation, Lesion Segmentation |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08392v2 |
http://arxiv.org/pdf/1807.08392v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-automatic-skin-lesion-segmentation |
Repo | |
Framework | |
New approach for solar tracking systems based on computer vision, low cost hardware and deep learning
Title | New approach for solar tracking systems based on computer vision, low cost hardware and deep learning |
Authors | Jose A. Carballo, Javier Bonilla, Manuel Berenguel, Jesús Fernández-Reche, Ginés García |
Abstract | In this work, a new approach for Sun tracking systems is presented. Due to the limitations of current systems regarding costs and operational problems, a new approach based on computer vision, low-cost open hardware and deep learning has been developed. Preliminary tests carried out successfully at the Plataforma Solar de Almería (PSA) reveal the great potential of the new approach and show it to be a good alternative to traditional systems. The proposed approach can provide key variables for the control of the Sun tracking system, such as cloud movement prediction, block and shadow detection, atmospheric attenuation or measures of concentrated solar radiation, which can improve the control strategies of the system and therefore the system performance. |
Tasks | Shadow Detection |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07048v1 |
http://arxiv.org/pdf/1809.07048v1.pdf | |
PWC | https://paperswithcode.com/paper/new-approach-for-solar-tracking-systems-based |
Repo | |
Framework | |