Paper Group AWR 290
Neural Ordinary Differential Equations. Scalable and accurate deep learning for electronic health records. Deep Vessel Segmentation By Learning Graphical Connectivity. Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond. Extracting Sentiment Attitudes From Analytical Texts. Attend, Copy, Parse – End-to-end information extraction from documents. …
Neural Ordinary Differential Equations
Title | Neural Ordinary Differential Equations |
Authors | Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud |
Abstract | We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models. |
Tasks | Latent Variable Models, Multivariate Time Series Forecasting, Multivariate Time Series Imputation |
Published | 2018-06-19 |
URL | https://arxiv.org/abs/1806.07366v5 |
PDF | https://arxiv.org/pdf/1806.07366v5.pdf |
PWC | https://paperswithcode.com/paper/neural-ordinary-differential-equations |
Repo | https://github.com/esghif/torchdiffeq |
Framework | pytorch |
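As a quick illustration of the continuous-depth idea, here is a minimal sketch of an ODE block built on the `torchdiffeq` package linked above. The MLP dynamics, tolerances, and integration interval are arbitrary choices for the sketch, not the paper's configuration.

```python
# odeint_adjoint backpropagates through the solver with O(1) memory,
# using the adjoint method the paper describes.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint

class ODEFunc(nn.Module):
    """Parameterizes dh/dt = f(h(t), t; theta) with a small MLP."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

class ODEBlock(nn.Module):
    """Replaces a discrete stack of residual layers with one solver call."""
    def __init__(self, dim):
        super().__init__()
        self.func = ODEFunc(dim)
        self.t = torch.tensor([0.0, 1.0])  # integrate from "depth" 0 to 1

    def forward(self, h0):
        # odeint returns the state at every time in self.t; keep the endpoint.
        return odeint(self.func, h0, self.t, rtol=1e-3, atol=1e-3)[-1]

h = torch.randn(32, 16)
print(ODEBlock(16)(h).shape)  # torch.Size([32, 16])
```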
Scalable and accurate deep learning for electronic health records
Title | Scalable and accurate deep learning for electronic health records |
Authors | Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum, Katherine Chou, Michael Pearson, Srinivasan Madabushi, Nigam H. Shah, Atul J. Butte, Michael Howell, Claire Cui, Greg Corrado, Jeff Dean |
Abstract | Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two U.S. academic medical centers with 216,221 adult patients hospitalized for at least 24 hours. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (AUROC across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed state-of-the-art traditional predictive models in all cases. We also present a case study of a neural-network attribution system, which illustrates how clinicians can gain some transparency into the predictions. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios, complete with explanations that directly highlight evidence in the patient’s chart. |
Tasks | |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.07860v3 |
PDF | http://arxiv.org/pdf/1801.07860v3.pdf |
PWC | https://paperswithcode.com/paper/scalable-and-accurate-deep-learning-for |
Repo | https://github.com/nwams/Predicting-Hospital-Readmission-using-NLP |
Framework | none |
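To make the sequential representation concrete, here is an illustrative sketch (not the paper's model, which ensembles several sequence architectures): each patient becomes a time-ordered list of discrete EHR-event ids consumed by a recurrent predictor. The event names, vocabulary, and model sizes are all assumptions.

```python
import torch
import torch.nn as nn

class MortalityPredictor(nn.Module):
    """Toy per-patient sequence model over FHIR-style event ids."""
    def __init__(self, vocab_size, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, events):                    # events: (batch, seq_len) ids
        _, (h, _) = self.rnn(self.embed(events))
        return torch.sigmoid(self.head(h[-1]))    # P(in-hospital mortality)

# e.g. ids for [admit, lab:lactate_high, rx:vancomycin, note_token, ...]
batch = torch.randint(0, 10000, (8, 200))
print(MortalityPredictor(10000)(batch).shape)     # torch.Size([8, 1])
```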
Deep Vessel Segmentation By Learning Graphical Connectivity
Title | Deep Vessel Segmentation By Learning Graphical Connectivity |
Authors | Seung Yeon Shin, Soochahn Lee, Il Dong Yun, Kyoung Mu Lee |
Abstract | We propose a novel deep-learning-based system for vessel segmentation. Existing methods using CNNs have mostly relied on local appearances learned on the regular image grid, without considering the graphical structure of vessel shape. To address this, we incorporate a graph convolutional network into a unified CNN architecture, where the final segmentation is inferred by combining the different types of features. The proposed method can be applied to any type of CNN-based vessel segmentation method to enhance its performance. Experiments show that the proposed method outperforms the current state-of-the-art methods on two retinal image datasets as well as a coronary artery X-ray angiography dataset. |
Tasks | Retinal Vessel Segmentation |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02279v1 |
PDF | http://arxiv.org/pdf/1806.02279v1.pdf |
PWC | https://paperswithcode.com/paper/deep-vessel-segmentation-by-learning |
Repo | https://github.com/syshin1014/VGN |
Framework | tf |
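A rough sketch of the hybrid idea, assuming nothing from the VGN repo: CNN features sampled at candidate vessel points are refined by a graph convolution over a vessel-neighborhood graph before being fused back for the final segmentation. The adjacency construction below is a placeholder.

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One mean-aggregating graph convolution: H' = ReLU(W (A_norm H))."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):
        deg = adj.sum(-1, keepdim=True).clamp(min=1)  # avoid divide-by-zero
        return torch.relu(self.lin((adj @ x) / deg))

feat = torch.randn(256, 32)                  # CNN features at 256 sampled vertices
adj = (torch.rand(256, 256) > 0.95).float()  # placeholder vessel graph
refined = SimpleGraphConv(32)(feat, adj)
print(refined.shape)                         # torch.Size([256, 32])
```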
Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond
Title | Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond |
Authors | Krishna Kumar Singh, Hao Yu, Aron Sarmasi, Gautam Pradeep, Yong Jae Lee |
Abstract | We propose ‘Hide-and-Seek’, a general-purpose data augmentation technique which is complementary to existing data augmentation techniques and is beneficial for various visual recognition tasks. The key idea is to hide patches in a training image randomly, in order to force the network to seek other relevant content when the most discriminative content is hidden. Our approach only needs to modify the input image and can work with any network to improve its performance. During testing, it does not need to hide any patches. The main advantage of Hide-and-Seek over existing data augmentation techniques is its ability to improve object localization accuracy in the weakly-supervised setting, and we therefore use this task to motivate the approach. However, Hide-and-Seek is not tied only to the image localization task, and can generalize to other forms of visual input like videos, as well as other recognition tasks like image classification, temporal action localization, semantic segmentation, emotion recognition, age/gender estimation, and person re-identification. We perform extensive experiments to showcase the advantage of Hide-and-Seek on these various visual recognition problems. |
Tasks | Action Localization, Data Augmentation, Emotion Recognition, Image Classification, Object Localization, Person Re-Identification, Semantic Segmentation, Temporal Action Localization |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02545v1 |
PDF | http://arxiv.org/pdf/1811.02545v1.pdf |
PWC | https://paperswithcode.com/paper/hide-and-seek-a-data-augmentation-technique |
Repo | https://github.com/kkanshul/Hide-and-Seek |
Framework | pytorch |
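The core trick fits in a few lines: divide each training image into a grid and hide each patch independently with some probability (the paper fills hidden patches with the dataset mean; the grid size and hide probability below are assumptions in the spirit of the paper).

```python
import torch

def hide_patches(img, grid=4, p_hide=0.5, fill=0.0):
    """img: (C, H, W) tensor. Returns a copy with random patches hidden."""
    c, h, w = img.shape
    ph, pw = h // grid, w // grid
    out = img.clone()
    for i in range(grid):
        for j in range(grid):
            if torch.rand(1).item() < p_hide:
                out[:, i*ph:(i+1)*ph, j*pw:(j+1)*pw] = fill
    return out

x = torch.rand(3, 224, 224)
x_aug = hide_patches(x)  # apply only at training time, never at test time
```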
Extracting Sentiment Attitudes From Analytical Texts
Title | Extracting Sentiment Attitudes From Analytical Texts |
Authors | Natalia Loukachevitch, Nicolay Rusnachenko |
Abstract | In this paper we present the RuSentRel corpus including analytical texts in the sphere of international relations. For each document we annotated sentiments from the author to mentioned named entities, and sentiments of relations between mentioned entities. In the current experiments, we considered the problem of extracting sentiment relations between entities across whole documents as a three-class machine learning task. We experimented with conventional machine-learning methods (Naive Bayes, SVM, Random Forest). |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08932v1 |
PDF | http://arxiv.org/pdf/1808.08932v1.pdf |
PWC | https://paperswithcode.com/paper/extracting-sentiment-attitudes-from |
Repo | https://github.com/nicolay-r/RuSentRel |
Framework | none |
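A hedged sketch of the three-class setup (positive / negative / neutral attitude between an entity pair) using the conventional classifiers the abstract names. The bag-of-words features over the pair's context are a stand-in; the paper's actual feature set is richer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

# Toy contexts for entity pairs; the "neutral" class is omitted here.
contexts = ["country A praised country B", "country C condemned country D"]
labels = ["pos", "neg"]

for clf in (MultinomialNB(), LinearSVC(), RandomForestClassifier()):
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(contexts, labels)
    print(type(clf).__name__, model.predict(["A praised D"]))
```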
Attend, Copy, Parse – End-to-end information extraction from documents
Title | Attend, Copy, Parse – End-to-end information extraction from documents |
Authors | Rasmus Berg Palm, Florian Laws, Ole Winther |
Abstract | Document information extraction tasks performed by humans create data consisting of a PDF or document image input, and extracted string outputs. This end-to-end data is naturally consumed and produced when performing the task because it is valuable in and of itself. It is naturally available, at no additional cost. Unfortunately, state-of-the-art word classification methods for information extraction cannot use this data, instead requiring word-level labels which are expensive to create and consequently not available for many real-life tasks. In this paper we propose the Attend, Copy, Parse architecture, a deep neural network model that can be trained directly on end-to-end data, bypassing the need for word-level labels. We evaluate the proposed architecture on a large diverse set of invoices, and outperform a state-of-the-art production system based on word classification. We believe our proposed architecture can be used on many real-life information extraction tasks where word classification cannot be used due to a lack of the required word-level labels. |
Tasks | |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.07248v2 |
PDF | https://arxiv.org/pdf/1812.07248v2.pdf |
PWC | https://paperswithcode.com/paper/attend-copy-parse-end-to-end-information |
Repo | https://github.com/Tradeshift/attend-copy-parse |
Framework | tf |
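A loose sketch of the three stages, with every module name and shape assumed rather than taken from the paper: "attend" scores each document word, "copy" takes a soft mixture of word representations under those scores, and "parse" decodes the mixture into the target string (here just a linear stub).

```python
import torch
import torch.nn as nn

class AttendCopyParseSketch(nn.Module):
    def __init__(self, dim, out_vocab):
        super().__init__()
        self.attend = nn.Linear(dim, 1)         # per-word relevance score
        self.parse = nn.Linear(dim, out_vocab)  # normalizer / decoder stub

    def forward(self, words):                   # words: (batch, n_words, dim)
        att = torch.softmax(self.attend(words).squeeze(-1), dim=-1)
        copied = (att.unsqueeze(-1) * words).sum(1)   # soft copy of words
        return self.parse(copied)               # logits for the output string

doc = torch.randn(2, 300, 128)                  # 300 encoded document words
print(AttendCopyParseSketch(128, 50)(doc).shape)  # torch.Size([2, 50])
```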
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image
Title | PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image |
Authors | Chen Liu, Jimei Yang, Duygu Ceylan, Ersin Yumer, Yasutaka Furukawa |
Abstract | This paper proposes a deep neural network (DNN) for piece-wise planar depthmap reconstruction from a single RGB image. While DNNs have brought remarkable progress to single-image depth prediction, piece-wise planar depthmap reconstruction requires a structured geometry representation, and has been a difficult task to master even for DNNs. The proposed end-to-end DNN learns to directly infer a set of plane parameters and corresponding plane segmentation masks from a single RGB image. We have generated more than 50,000 piece-wise planar depthmaps for training and testing from ScanNet, a large-scale RGBD video database. Our qualitative and quantitative evaluations demonstrate that the proposed approach outperforms baseline methods in terms of both plane segmentation and depth estimation accuracy. To the best of our knowledge, this paper presents the first end-to-end neural architecture for piece-wise planar reconstruction from a single RGB image. Code and data are available at https://github.com/art-programmer/PlaneNet. |
Tasks | Depth Estimation |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06278v1 |
PDF | http://arxiv.org/pdf/1804.06278v1.pdf |
PWC | https://paperswithcode.com/paper/planenet-piece-wise-planar-reconstruction |
Repo | https://github.com/art-programmer/PlaneNet |
Framework | tf |
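A schematic of the output structure only, not the PlaneNet architecture itself: from a shared feature map the network regresses K sets of plane parameters and predicts K+1 segmentation masks (K planes plus a non-planar class). K=10 matches the paper's setup; the backbone and channel counts below are stand-ins.

```python
import torch
import torch.nn as nn

class PlaneHeads(nn.Module):
    def __init__(self, feat_ch=64, k_planes=10):
        super().__init__()
        self.k = k_planes
        self.params = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                    nn.Linear(feat_ch, k_planes * 3))
        self.masks = nn.Conv2d(feat_ch, k_planes + 1, 1)

    def forward(self, feat):                          # feat: (B, C, H, W)
        planes = self.params(feat).view(-1, self.k, 3)  # per-plane parameters
        masks = self.masks(feat).softmax(dim=1)         # per-pixel assignment
        return planes, masks

planes, masks = PlaneHeads()(torch.randn(2, 64, 60, 80))
print(planes.shape, masks.shape)  # (2, 10, 3) (2, 11, 60, 80)
```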
Pixel-wise Attentional Gating for Parsimonious Pixel Labeling
Title | Pixel-wise Attentional Gating for Parsimonious Pixel Labeling |
Authors | Shu Kong, Charless Fowlkes |
Abstract | To achieve parsimonious inference in per-pixel labeling tasks with a limited computational budget, we propose a Pixel-wise Attentional Gating unit (PAG) that learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily “plugged in” to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns dynamic spatial allocation of computation over the input image, which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe PAG can reduce computation by 10% without noticeable loss in accuracy, and performance degrades gracefully when imposing stronger computational constraints. |
Tasks | Boundary Detection, Semantic Segmentation |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01556v2 |
PDF | http://arxiv.org/pdf/1805.01556v2.pdf |
PWC | https://paperswithcode.com/paper/pixel-wise-attentional-gating-for |
Repo | https://github.com/aimerykong/Pixel-Attentional-Gating |
Framework | none |
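A toy version of per-pixel gating, assuming none of the paper's code: a 1x1 conv predicts a keep-probability per spatial location, a hard binary mask is obtained with a straight-through estimator, and gated-off pixels skip the expensive branch entirely.

```python
import torch
import torch.nn as nn

class PixelGate(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Conv2d(ch, 1, 1)
        self.branch = nn.Conv2d(ch, ch, 3, padding=1)  # "expensive" path

    def forward(self, x):
        p = torch.sigmoid(self.gate(x))                # (B,1,H,W) keep prob
        hard = (p > 0.5).float()
        mask = hard + p - p.detach()                   # straight-through gradient
        return mask * self.branch(x) + (1 - mask) * x  # process or pass through

y = PixelGate(32)(torch.randn(2, 32, 16, 16))
print(y.shape)  # torch.Size([2, 32, 16, 16])
```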
Self-supervised Learning of Dense Shape Correspondence
Title | Self-supervised Learning of Dense Shape Correspondence |
Authors | Oshri Halimi, Or Litany, Emanuele Rodolà, Alex Bronstein, Ron Kimmel |
Abstract | We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes. Key to our model is the understanding that natural deformations (such as changes in pose) approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it by a purely geometric criterion. The resulting learning model is class-agnostic, and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize on the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where we outperform other methods in terms of accuracy, generalization, and efficiency. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02415v1 |
PDF | http://arxiv.org/pdf/1812.02415v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-learning-of-dense-shape |
Repo | https://github.com/OshriHalimi/unsupervised_learning_of_dense_shape_correspondence |
Framework | none |
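One way to write the metric-preservation criterion the abstract describes, with all tensors assumed for the sketch: a soft correspondence P pulls shape Y's geodesic metric onto shape X, and the loss penalizes distortion of pairwise distances.

```python
import torch

def distortion_loss(P, Dx, Dy):
    """P: (n, m) soft correspondence (rows sum to 1),
    Dx: (n, n) geodesic distances on X, Dy: (m, m) on Y."""
    Dy_mapped = P @ Dy @ P.T              # pull Y's metric onto X
    return ((Dx - Dy_mapped) ** 2).mean() # penalize metric distortion

n, m = 100, 120
P = torch.softmax(torch.randn(n, m), dim=-1)
Dx, Dy = torch.rand(n, n), torch.rand(m, m)
print(distortion_loss(P, Dx, Dy))
```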
Abduction-Based Explanations for Machine Learning Models
Title | Abduction-Based Explanations for Machine Learning Models |
Authors | Alexey Ignatiev, Nina Narodytska, Joao Marques-Silva |
Abstract | The growing range of applications of Machine Learning (ML) in a multitude of settings motivates the need to compute small explanations for the predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Most earlier work on computing explanations is based on heuristic approaches, providing no guarantees of quality in terms of how close such solutions are to cardinality- or subset-minimal explanations. This paper develops a constraint-agnostic solution for computing explanations for any ML model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The experimental results, obtained on well-known datasets, validate the scalability of the proposed approach as well as the quality of the computed solutions. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10656v1 |
PDF | http://arxiv.org/pdf/1811.10656v1.pdf |
PWC | https://paperswithcode.com/paper/abduction-based-explanations-for-machine |
Repo | https://github.com/alexeyignatiev/xplainer |
Framework | none |
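The standard deletion-based loop for subset-minimal explanations, which this style of approach builds on; the oracle below is a hypothetical stand-in for the paper's constraint solver. `entails(fixed)` should return True iff fixing exactly the features in `fixed` to the instance's values already forces the model's prediction.

```python
def subset_minimal_explanation(features, entails):
    """Start from all features; drop any one whose removal still entails
    the prediction. The survivors form a subset-minimal explanation."""
    expl = set(features)
    for f in list(features):
        if entails(expl - {f}):
            expl.remove(f)
    return expl

# Toy oracle: the prediction is forced whenever features 0 and 2 are fixed.
entails = lambda fixed: {0, 2} <= fixed
print(subset_minimal_explanation([0, 1, 2, 3], entails))  # {0, 2}
```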
An improved neural network model for joint POS tagging and dependency parsing
Title | An improved neural network model for joint POS tagging and dependency parsing |
Authors | Dat Quoc Nguyen, Karin Verspoor |
Abstract | We propose a novel neural network model for joint part-of-speech (POS) tagging and dependency parsing. Our model extends the well-known BIST graph-based dependency parser (Kiperwasser and Goldberg, 2016) by incorporating a BiLSTM-based tagging component to produce automatically predicted POS tags for the parser. On the benchmark English Penn treebank, our model obtains strong UAS and LAS scores at 94.51% and 92.87%, respectively, producing 1.5+% absolute improvements over the BIST graph-based parser, and also obtaining a state-of-the-art POS tagging accuracy at 97.97%. Furthermore, experimental results on parsing 61 “big” Universal Dependencies treebanks from raw texts show that our model outperforms the baseline UDPipe (Straka and Straková, 2017) with 0.8% higher average POS tagging score and 3.6% higher average LAS score. In addition, with our model, we also obtain state-of-the-art downstream task scores for biomedical event extraction and opinion analysis applications. Our code is available together with all pre-trained models at: https://github.com/datquocnguyen/jPTDP |
Tasks | Dependency Parsing, Part-Of-Speech Tagging |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.03955v2 |
PDF | http://arxiv.org/pdf/1807.03955v2.pdf |
PWC | https://paperswithcode.com/paper/an-improved-neural-network-model-for-joint |
Repo | https://github.com/datquocnguyen/jPTDP |
Framework | none |
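A structural sketch of the joint design (not the jPTDP code): a BiLSTM tagger predicts POS tags, and each predicted tag's embedding is fed, together with the word embedding, into the parser's own BiLSTM encoder. All dimensions are illustrative, and the arc scorer is omitted.

```python
import torch
import torch.nn as nn

class JointTaggerParserEncoder(nn.Module):
    def __init__(self, vocab, n_tags, wdim=100, tdim=32, hid=128):
        super().__init__()
        self.wemb = nn.Embedding(vocab, wdim)
        self.tag_lstm = nn.LSTM(wdim, hid, batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * hid, n_tags)
        self.temb = nn.Embedding(n_tags, tdim)
        self.parse_lstm = nn.LSTM(wdim + tdim, hid, batch_first=True,
                                  bidirectional=True)

    def forward(self, words):                         # words: (batch, seq)
        w = self.wemb(words)
        tag_logits = self.tagger(self.tag_lstm(w)[0])
        tags = tag_logits.argmax(-1)                  # predicted POS tags
        h, _ = self.parse_lstm(torch.cat([w, self.temb(tags)], dim=-1))
        return tag_logits, h   # h would feed biaffine/MLP arc scoring

logits, h = JointTaggerParserEncoder(5000, 17)(torch.randint(0, 5000, (2, 12)))
print(logits.shape, h.shape)  # (2, 12, 17) (2, 12, 256)
```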
Style Augmentation: Data Augmentation via Style Randomization
Title | Style Augmentation: Data Augmentation via Style Randomization |
Authors | Philip T. Jackson, Amir Atapour-Abarghouei, Stephen Bonner, Toby Breckon, Boguslaw Obara |
Abstract | We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content. This is accomplished by adapting an arbitrary style transfer network to perform style randomization, by sampling input style embeddings from a multivariate normal distribution instead of inferring them from a style image. In addition to standard classification experiments, we investigate the effect of style augmentation (and data augmentation generally) on domain transfer tasks. We find that data augmentation significantly improves robustness to domain shift, and can be used as a simple, domain agnostic alternative to domain adaptation. Comparing style augmentation against a mix of seven traditional augmentation techniques, we find that it can be readily combined with them to improve network performance. We validate the efficacy of our technique with domain transfer experiments in classification and monocular depth estimation, illustrating consistent improvements in generalization. |
Tasks | Data Augmentation, Depth Estimation, Domain Adaptation, Monocular Depth Estimation, Style Transfer |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05375v2 |
PDF | http://arxiv.org/pdf/1809.05375v2.pdf |
PWC | https://paperswithcode.com/paper/style-augmentation-data-augmentation-via |
Repo | https://github.com/philipjackson/style-augmentation |
Framework | pytorch |
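The core trick in a few lines, with the style network left abstract: instead of encoding a real style image, sample the style embedding from a normal distribution (in practice fitted to embeddings of a large style corpus). Both `style_transfer_net` and the embedding moments `mu`/`sigma` are assumptions here.

```python
import torch

def style_augment(content_img, style_transfer_net, mu, sigma, alpha=0.5):
    """content_img: (B,3,H,W); mu/sigma: per-dimension moments of the
    style-embedding distribution (e.g. fitted on a large style dataset)."""
    z = mu + sigma * torch.randn_like(mu)      # random style, no style image
    stylized = style_transfer_net(content_img, z)
    # interpolate toward the original to control augmentation strength
    return alpha * stylized + (1 - alpha) * content_img
```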
Pose2Seg: Detection Free Human Instance Segmentation
Title | Pose2Seg: Detection Free Human Instance Segmentation |
Authors | Song-Hai Zhang, Ruilong Li, Xin Dong, Paul L. Rosin, Zixi Cai, Han Xi, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu |
Abstract | The standard approach to image instance segmentation is to perform the object detection first, and then segment the object from the detection bounding-box. More recently, deep learning methods like Mask R-CNN perform them jointly. However, little research takes into account the uniqueness of the “human” category, which can be well defined by the pose skeleton. Moreover, the human pose skeleton can be used to better distinguish instances with heavy occlusion than using bounding-boxes. In this paper, we present a brand new pose-based instance segmentation framework for humans which separates instances based on human pose, rather than proposal region detection. We demonstrate that our pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem, and can moreover better handle occlusion. Furthermore, there are few public datasets containing many heavily occluded humans along with comprehensive annotations, which makes this a challenging problem seldom noticed by researchers. Therefore, in this paper we introduce a new benchmark “Occluded Human (OCHuman)”, which focuses on occluded humans with comprehensive annotations including bounding-box, human pose and instance masks. This dataset contains 8110 detailed annotated human instances within 4731 images. With an average 0.67 MaxIoU for each person, OCHuman is the most complex and challenging dataset related to human instance segmentation. Through this dataset, we want to emphasize occlusion as a challenging problem for researchers to study. |
Tasks | Human Instance Segmentation, Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10683v3 |
PDF | http://arxiv.org/pdf/1803.10683v3.pdf |
PWC | https://paperswithcode.com/paper/pose2seg-detection-free-human-instance |
Repo | https://github.com/liruilong940607/OCHumanApi |
Framework | none |
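A small sketch of the MaxIoU statistic quoted above: for each person, the maximum mask IoU with any other person in the same image, so an average of 0.67 means instances overlap heavily. Masks are boolean arrays; this is an illustration, not the benchmark's evaluation code.

```python
import numpy as np

def max_iou(masks):
    """masks: list of (H, W) boolean arrays, one per person."""
    def iou(a, b):
        return np.logical_and(a, b).sum() / max(np.logical_or(a, b).sum(), 1)
    return [max((iou(m, o) for j, o in enumerate(masks) if j != i), default=0.0)
            for i, m in enumerate(masks)]

a = np.zeros((4, 4), bool); a[:2] = True   # person 1: top half
b = np.zeros((4, 4), bool); b[1:3] = True  # person 2: overlapping middle band
print(max_iou([a, b]))  # [0.333..., 0.333...]
```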
Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks
Title | Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks |
Authors | Thomas Brunner, Frederik Diehl, Michael Truong Le, Alois Knoll |
Abstract | We consider adversarial examples for image classification in the black-box decision-based setting. Here, an attacker cannot access confidence scores, but only the final label. Most attacks for this scenario are either unreliable or inefficient. Focusing on the latter, we show that a specific class of attacks, Boundary Attacks, can be reinterpreted as a biased sampling framework that gains efficiency from domain knowledge. We identify three such biases, image frequency, regional masks and surrogate gradients, and evaluate their performance against an ImageNet classifier. We show that the combination of these biases outperforms the state of the art by a wide margin. We also showcase an efficient way to attack the Google Cloud Vision API, where we craft convincing perturbations with just a few hundred queries. Finally, the methods we propose have also been found to work very well against strong defenses: Our targeted attack won second place in the NeurIPS 2018 Adversarial Vision Challenge. |
Tasks | Image Classification |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.09803v3 |
PDF | https://arxiv.org/pdf/1812.09803v3.pdf |
PWC | https://paperswithcode.com/paper/guessing-smart-biased-sampling-for-efficient |
Repo | https://github.com/ttbrunner/biased_boundary_attack_avc |
Framework | none |
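To illustrate the frequency bias alone (one of the paper's three biases; the paper implements its low-frequency prior with Perlin noise), here is a simpler alternative way to sample low-frequency perturbations: populate only the low-frequency corner of the DCT spectrum. The cutoff fraction is an assumption.

```python
import numpy as np
from scipy.fftpack import idct

def low_freq_noise(h, w, frac=0.25):
    """Return (h, w) noise whose energy lies in low spatial frequencies."""
    spec = np.zeros((h, w))
    kh, kw = int(h * frac), int(w * frac)
    spec[:kh, :kw] = np.random.randn(kh, kw)   # only low-freq coefficients
    # inverse 2-D DCT brings the spectrum back to pixel space
    return idct(idct(spec, axis=0, norm="ortho"), axis=1, norm="ortho")

noise = low_freq_noise(224, 224)
print(noise.shape, float(noise.std()))
```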
Face Alignment in Full Pose Range: A 3D Total Solution
Title | Face Alignment in Full Pose Range: A 3D Total Solution |
Authors | Xiangyu Zhu, Xiaoming Liu, Zhen Lei, Stan Z. Li |
Abstract | Face alignment, which fits a face model to an image and extracts the semantic meanings of facial pixels, has been an important topic in the computer vision community. However, most algorithms are designed for faces in small to medium poses (yaw angle is smaller than 45 degrees), which lack the ability to align faces in large poses up to 90 degrees. The challenges are three-fold. Firstly, the commonly used landmark face model assumes that all the landmarks are visible and is therefore not suitable for large poses. Secondly, the face appearance varies more drastically across large poses, from the frontal view to the profile view. Thirdly, labelling landmarks in large poses is extremely challenging since the invisible landmarks have to be guessed. In this paper, we propose to tackle these three challenges in a new alignment framework termed 3D Dense Face Alignment (3DDFA), in which a dense 3D Morphable Model (3DMM) is fitted to the image via Cascaded Convolutional Neural Networks. We also utilize 3D information to synthesize face images in profile views to provide abundant samples for training. Experiments on the challenging AFLW database show that the proposed approach achieves significant improvements over the state-of-the-art methods. |
Tasks | 3D Pose Estimation, Depth Image Estimation, Face Alignment, Face Reconstruction, Pose Estimation |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.01005v1 |
PDF | http://arxiv.org/pdf/1804.01005v1.pdf |
PWC | https://paperswithcode.com/paper/face-alignment-in-full-pose-range-a-3d-total |
Repo | https://github.com/nabeel3133/3D-texture-fitting |
Framework | none |
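The cascaded fitting loop in schematic form (details differ from the paper): at each stage a CNN sees the image plus a feature rendered from the current 3DMM parameters and predicts a residual parameter update. `render_pncc` stands in for the paper's Projected Normalized Coordinate Code rendering, and the 62-dimensional parameter vector is assumed.

```python
import torch

def fit_3dmm(img, cnn, render_pncc, n_iters=3, n_params=62):
    """img: (B,3,H,W). Returns refined 3DMM parameters (B, n_params).
    `cnn` and `render_pncc` are hypothetical callables supplied by the user."""
    params = torch.zeros(img.shape[0], n_params)      # start from the mean face
    for _ in range(n_iters):
        pncc = render_pncc(params, img.shape[-2:])    # (B,3,H,W) rendering
        params = params + cnn(torch.cat([img, pncc], dim=1))  # residual update
    return params
```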