Paper Group AWR 290
Neural Ordinary Differential Equations. Scalable and accurate deep learning for electronic health records. Deep Vessel Segmentation By Learning Graphical Connectivity. Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond. Extracting Sentiment Attitudes From Analytical Texts. Attend, Copy, Parse – End-to-end information extraction from documents. …
Neural Ordinary Differential Equations
Title | Neural Ordinary Differential Equations |
Authors | Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud |
Abstract | We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models. |
Tasks | Latent Variable Models, Multivariate Time Series Forecasting, Multivariate Time Series Imputation |
Published | 2018-06-19 |
URL | https://arxiv.org/abs/1806.07366v5 |
PDF | https://arxiv.org/pdf/1806.07366v5.pdf |
PWC | https://paperswithcode.com/paper/neural-ordinary-differential-equations |
Repo | https://github.com/esghif/torchdiffeq |
Framework | pytorch |
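As a quick illustration of the continuous-depth idea, here is a minimal sketch of an ODE block built on the `torchdiffeq` package linked above. The MLP dynamics, tolerances, and integration interval are arbitrary choices for the sketch, not the paper's configuration.

```python
# odeint_adjoint backpropagates through the solver with O(1) memory,
# using the adjoint method the paper describes.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint

class ODEFunc(nn.Module):
    """Parameterizes dh/dt = f(h(t), t; theta) with a small MLP."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

class ODEBlock(nn.Module):
    """Replaces a discrete stack of residual layers with one solver call."""
    def __init__(self, dim):
        super().__init__()
        self.func = ODEFunc(dim)
        self.t = torch.tensor([0.0, 1.0])  # integrate from "depth" 0 to 1

    def forward(self, h0):
        # odeint returns the state at every time in self.t; keep the endpoint.
        return odeint(self.func, h0, self.t, rtol=1e-3, atol=1e-3)[-1]

h = torch.randn(32, 16)
print(ODEBlock(16)(h).shape)  # torch.Size([32, 16])
```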
Scalable and accurate deep learning for electronic health records
Title | Scalable and accurate deep learning for electronic health records |
Authors | Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum, Katherine Chou, Michael Pearson, Srinivasan Madabushi, Nigam H. Shah, Atul J. Butte, Michael Howell, Claire Cui, Greg Corrado, Jeff Dean |
Abstract | Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two U.S. academic medical centers with 216,221 adult patients hospitalized for at least 24 hours. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (AUROC across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed state-of-the-art traditional predictive models in all cases. We also present a case study of a neural-network attribution system, which illustrates how clinicians can gain some transparency into the predictions. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios, complete with explanations that directly highlight evidence in the patient’s chart. |
Tasks | |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.07860v3 |
PDF | http://arxiv.org/pdf/1801.07860v3.pdf |
PWC | https://paperswithcode.com/paper/scalable-and-accurate-deep-learning-for |
Repo | https://github.com/nwams/Predicting-Hospital-Readmission-using-NLP |
Framework | none |
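To make the sequential representation concrete, here is an illustrative sketch (not the paper's model, which ensembles several sequence architectures): each patient becomes a time-ordered list of discrete EHR-event ids consumed by a recurrent predictor. The event names, vocabulary, and model sizes are all assumptions.

```python
import torch
import torch.nn as nn

class MortalityPredictor(nn.Module):
    """Toy per-patient sequence model over FHIR-style event ids."""
    def __init__(self, vocab_size, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, events):                    # events: (batch, seq_len) ids
        _, (h, _) = self.rnn(self.embed(events))
        return torch.sigmoid(self.head(h[-1]))    # P(in-hospital mortality)

# e.g. ids for [admit, lab:lactate_high, rx:vancomycin, note_token, ...]
batch = torch.randint(0, 10000, (8, 200))
print(MortalityPredictor(10000)(batch).shape)     # torch.Size([8, 1])
```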
Deep Vessel Segmentation By Learning Graphical Connectivity
Title | Deep Vessel Segmentation By Learning Graphical Connectivity |
Authors | Seung Yeon Shin, Soochahn Lee, Il Dong Yun, Kyoung Mu Lee |
Abstract | We propose a novel deep-learning-based system for vessel segmentation. Existing methods using CNNs have mostly relied on local appearances learned on the regular image grid, without considering the graphical structure of vessel shape. To address this, we incorporate a graph convolutional network into a unified CNN architecture, where the final segmentation is inferred by combining the different types of features. The proposed method can be applied to any type of CNN-based vessel segmentation method to enhance its performance. Experiments show that the proposed method outperforms the current state-of-the-art methods on two retinal image datasets as well as a coronary artery X-ray angiography dataset. |
Tasks | Retinal Vessel Segmentation |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02279v1 |
PDF | http://arxiv.org/pdf/1806.02279v1.pdf |
PWC | https://paperswithcode.com/paper/deep-vessel-segmentation-by-learning |
Repo | https://github.com/syshin1014/VGN |
Framework | tf |
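A rough sketch of the hybrid idea, assuming nothing from the VGN repo: CNN features sampled at candidate vessel points are refined by a graph convolution over a vessel-neighborhood graph before being fused back for the final segmentation. The adjacency construction below is a placeholder.

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One mean-aggregating graph convolution: H' = ReLU(W (A_norm H))."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):
        deg = adj.sum(-1, keepdim=True).clamp(min=1)  # avoid divide-by-zero
        return torch.relu(self.lin((adj @ x) / deg))

feat = torch.randn(256, 32)                  # CNN features at 256 sampled vertices
adj = (torch.rand(256, 256) > 0.95).float()  # placeholder vessel graph
refined = SimpleGraphConv(32)(feat, adj)
print(refined.shape)                         # torch.Size([256, 32])
```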
Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond
Title | Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond |
Authors | Krishna Kumar Singh, Hao Yu, Aron Sarmasi, Gautam Pradeep, Yong Jae Lee |
Abstract | We propose ‘Hide-and-Seek’, a general-purpose data augmentation technique which is complementary to existing data augmentation techniques and is beneficial for various visual recognition tasks. The key idea is to hide patches in a training image randomly, in order to force the network to seek other relevant content when the most discriminative content is hidden. Our approach only needs to modify the input image and can work with any network to improve its performance. During testing, it does not need to hide any patches. The main advantage of Hide-and-Seek over existing data augmentation techniques is its ability to improve object localization accuracy in the weakly-supervised setting, and we therefore use this task to motivate the approach. However, Hide-and-Seek is not tied only to the image localization task, and can generalize to other forms of visual input like videos, as well as other recognition tasks like image classification, temporal action localization, semantic segmentation, emotion recognition, age/gender estimation, and person re-identification. We perform extensive experiments to showcase the advantage of Hide-and-Seek on these various visual recognition problems. |
Tasks | Action Localization, Data Augmentation, Emotion Recognition, Image Classification, Object Localization, Person Re-Identification, Semantic Segmentation, Temporal Action Localization |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02545v1 |
PDF | http://arxiv.org/pdf/1811.02545v1.pdf |
PWC | https://paperswithcode.com/paper/hide-and-seek-a-data-augmentation-technique |
Repo | https://github.com/kkanshul/Hide-and-Seek |
Framework | pytorch |
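The core trick fits in a few lines: divide each training image into a grid and hide each patch independently with some probability (the paper fills hidden patches with the dataset mean; the grid size and hide probability below are assumptions in the spirit of the paper).

```python
import torch

def hide_patches(img, grid=4, p_hide=0.5, fill=0.0):
    """img: (C, H, W) tensor. Returns a copy with random patches hidden."""
    c, h, w = img.shape
    ph, pw = h // grid, w // grid
    out = img.clone()
    for i in range(grid):
        for j in range(grid):
            if torch.rand(1).item() < p_hide:
                out[:, i*ph:(i+1)*ph, j*pw:(j+1)*pw] = fill
    return out

x = torch.rand(3, 224, 224)
x_aug = hide_patches(x)  # apply only at training time, never at test time
```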
Extracting Sentiment Attitudes From Analytical Texts
Title | Extracting Sentiment Attitudes From Analytical Texts |
Authors | Natalia Loukachevitch, Nicolay Rusnachenko |
Abstract | In this paper we present the RuSentRel corpus including analytical texts in the sphere of international relations. For each document we annotated sentiments from the author to mentioned named entities, and sentiments of relations between mentioned entities. In the current experiments, we considered the problem of extracting sentiment relations between entities across whole documents as a three-class machine learning task. We experimented with conventional machine-learning methods (Naive Bayes, SVM, Random Forest). |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08932v1 |
PDF | http://arxiv.org/pdf/1808.08932v1.pdf |
PWC | https://paperswithcode.com/paper/extracting-sentiment-attitudes-from |
Repo | https://github.com/nicolay-r/RuSentRel |
Framework | none |
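A hedged sketch of the three-class setup (positive / negative / neutral attitude between an entity pair) using the conventional classifiers the abstract names. The bag-of-words features over the pair's context are a stand-in; the paper's actual feature set is richer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

# Toy contexts for entity pairs; the "neutral" class is omitted here.
contexts = ["country A praised country B", "country C condemned country D"]
labels = ["pos", "neg"]

for clf in (MultinomialNB(), LinearSVC(), RandomForestClassifier()):
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(contexts, labels)
    print(type(clf).__name__, model.predict(["A praised D"]))
```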
Attend, Copy, Parse – End-to-end information extraction from documents
Title | Attend, Copy, Parse – End-to-end information extraction from documents |
Authors | Rasmus Berg Palm, Florian Laws, Ole Winther |
Abstract | Document information extraction tasks performed by humans create data consisting of a PDF or document image input, and extracted string outputs. This end-to-end data is naturally consumed and produced when performing the task because it is valuable in and of itself. It is naturally available, at no additional cost. Unfortunately, state-of-the-art word classification methods for information extraction cannot use this data, instead requiring word-level labels which are expensive to create and consequently not available for many real-life tasks. In this paper we propose the Attend, Copy, Parse architecture, a deep neural network model that can be trained directly on end-to-end data, bypassing the need for word-level labels. We evaluate the proposed architecture on a large diverse set of invoices, and outperform a state-of-the-art production system based on word classification. We believe our proposed architecture can be used on many real-life information extraction tasks where word classification cannot be used due to a lack of the required word-level labels. |
Tasks | |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.07248v2 |
PDF | https://arxiv.org/pdf/1812.07248v2.pdf |
PWC | https://paperswithcode.com/paper/attend-copy-parse-end-to-end-information |
Repo | https://github.com/Tradeshift/attend-copy-parse |
Framework | tf |
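A loose sketch of the three stages, with every module name and shape assumed rather than taken from the paper: "attend" scores each document word, "copy" takes a soft mixture of word representations under those scores, and "parse" decodes the mixture into the target string (here just a linear stub).

```python
import torch
import torch.nn as nn

class AttendCopyParseSketch(nn.Module):
    def __init__(self, dim, out_vocab):
        super().__init__()
        self.attend = nn.Linear(dim, 1)         # per-word relevance score
        self.parse = nn.Linear(dim, out_vocab)  # normalizer / decoder stub

    def forward(self, words):                   # words: (batch, n_words, dim)
        att = torch.softmax(self.attend(words).squeeze(-1), dim=-1)
        copied = (att.unsqueeze(-1) * words).sum(1)   # soft copy of words
        return self.parse(copied)               # logits for the output string

doc = torch.randn(2, 300, 128)                  # 300 encoded document words
print(AttendCopyParseSketch(128, 50)(doc).shape)  # torch.Size([2, 50])
```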
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image
Title | PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image |
Authors | Chen Liu, Jimei Yang, Duygu Ceylan, Ersin Yumer, Yasutaka Furukawa |
Abstract | This paper proposes a deep neural network (DNN) for piece-wise planar depthmap reconstruction from a single RGB image. While DNNs have brought remarkable progress to single-image depth prediction, piece-wise planar depthmap reconstruction requires a structured geometry representation, and has been a difficult task to master even for DNNs. The proposed end-to-end DNN learns to directly infer a set of plane parameters and corresponding plane segmentation masks from a single RGB image. We have generated more than 50,000 piece-wise planar depthmaps for training and testing from ScanNet, a large-scale RGBD video database. Our qualitative and quantitative evaluations demonstrate that the proposed approach outperforms baseline methods in terms of both plane segmentation and depth estimation accuracy. To the best of our knowledge, this paper presents the first end-to-end neural architecture for piece-wise planar reconstruction from a single RGB image. Code and data are available at https://github.com/art-programmer/PlaneNet. |
Tasks | Depth Estimation |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06278v1 |
PDF | http://arxiv.org/pdf/1804.06278v1.pdf |
PWC | https://paperswithcode.com/paper/planenet-piece-wise-planar-reconstruction |
Repo | https://github.com/art-programmer/PlaneNet |
Framework | tf |
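A schematic of the output structure only, not the PlaneNet architecture itself: from a shared feature map the network regresses K sets of plane parameters and predicts K+1 segmentation masks (K planes plus a non-planar class). K=10 matches the paper's setup; the backbone and channel counts below are stand-ins.

```python
import torch
import torch.nn as nn

class PlaneHeads(nn.Module):
    def __init__(self, feat_ch=64, k_planes=10):
        super().__init__()
        self.k = k_planes
        self.params = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                    nn.Linear(feat_ch, k_planes * 3))
        self.masks = nn.Conv2d(feat_ch, k_planes + 1, 1)

    def forward(self, feat):                          # feat: (B, C, H, W)
        planes = self.params(feat).view(-1, self.k, 3)  # per-plane parameters
        masks = self.masks(feat).softmax(dim=1)         # per-pixel assignment
        return planes, masks

planes, masks = PlaneHeads()(torch.randn(2, 64, 60, 80))
print(planes.shape, masks.shape)  # (2, 10, 3) (2, 11, 60, 80)
```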
Pixel-wise Attentional Gating for Parsimonious Pixel Labeling
Title | Pixel-wise Attentional Gating for Parsimonious Pixel Labeling |
Authors | Shu Kong, Charless Fowlkes |
Abstract | To achieve parsimonious inference in per-pixel labeling tasks with a limited computational budget, we propose a Pixel-wise Attentional Gating unit (PAG) that learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily “plugged in” to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns dynamic spatial allocation of computation over the input image, which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe PAG can reduce computation by 10% without noticeable loss in accuracy, and performance degrades gracefully when imposing stronger computational constraints. |
Tasks | Boundary Detection, Semantic Segmentation |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01556v2 |
PDF | http://arxiv.org/pdf/1805.01556v2.pdf |
PWC | https://paperswithcode.com/paper/pixel-wise-attentional-gating-for |
Repo | https://github.com/aimerykong/Pixel-Attentional-Gating |
Framework | none |
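A toy version of per-pixel gating, assuming none of the paper's code: a 1x1 conv predicts a keep-probability per spatial location, a hard binary mask is obtained with a straight-through estimator, and gated-off pixels skip the expensive branch entirely.

```python
import torch
import torch.nn as nn

class PixelGate(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Conv2d(ch, 1, 1)
        self.branch = nn.Conv2d(ch, ch, 3, padding=1)  # "expensive" path

    def forward(self, x):
        p = torch.sigmoid(self.gate(x))                # (B,1,H,W) keep prob
        hard = (p > 0.5).float()
        mask = hard + p - p.detach()                   # straight-through gradient
        return mask * self.branch(x) + (1 - mask) * x  # process or pass through

y = PixelGate(32)(torch.randn(2, 32, 16, 16))
print(y.shape)  # torch.Size([2, 32, 16, 16])
```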
Self-supervised Learning of Dense Shape Correspondence
Title | Self-supervised Learning of Dense Shape Correspondence |
Authors | Oshri Halimi, Or Litany, Emanuele Rodolà, Alex Bronstein, Ron Kimmel |
Abstract | We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes. Key to our model is the understanding that natural deformations (such as changes in pose) approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it by a purely geometric criterion. The resulting learning model is class-agnostic, and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize on the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where we outperform other methods in terms of accuracy, generalization, and efficiency. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02415v1 |
PDF | http://arxiv.org/pdf/1812.02415v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-learning-of-dense-shape |
Repo | https://github.com/OshriHalimi/unsupervised_learning_of_dense_shape_correspondence |
Framework | none |
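One way to write the metric-preservation criterion the abstract describes, with all tensors assumed for the sketch: a soft correspondence P pulls shape Y's geodesic metric onto shape X, and the loss penalizes distortion of pairwise distances.

```python
import torch

def distortion_loss(P, Dx, Dy):
    """P: (n, m) soft correspondence (rows sum to 1),
    Dx: (n, n) geodesic distances on X, Dy: (m, m) on Y."""
    Dy_mapped = P @ Dy @ P.T              # pull Y's metric onto X
    return ((Dx - Dy_mapped) ** 2).mean() # penalize metric distortion

n, m = 100, 120
P = torch.softmax(torch.randn(n, m), dim=-1)
Dx, Dy = torch.rand(n, n), torch.rand(m, m)
print(distortion_loss(P, Dx, Dy))
```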
Abduction-Based Explanations for Machine Learning Models
Title | Abduction-Based Explanations for Machine Learning Models |
Authors | Alexey Ignatiev, Nina Narodytska, Joao Marques-Silva |
Abstract | The growing range of applications of Machine Learning (ML) in a multitude of settings motivates the need to compute small explanations for the predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Most earlier work on computing explanations is based on heuristic approaches, providing no guarantees of quality in terms of how close such solutions are to cardinality- or subset-minimal explanations. This paper develops a constraint-agnostic solution for computing explanations for any ML model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The experimental results, obtained on well-known datasets, validate the scalability of the proposed approach as well as the quality of the computed solutions. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10656v1 |
PDF | http://arxiv.org/pdf/1811.10656v1.pdf |
PWC | https://paperswithcode.com/paper/abduction-based-explanations-for-machine |
Repo | https://github.com/alexeyignatiev/xplainer |
Framework | none |
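The standard deletion-based loop for subset-minimal explanations, which this style of approach builds on; the oracle below is a hypothetical stand-in for the paper's constraint solver. `entails(fixed)` should return True iff fixing exactly the features in `fixed` to the instance's values already forces the model's prediction.

```python
def subset_minimal_explanation(features, entails):
    """Start from all features; drop any one whose removal still entails
    the prediction. The survivors form a subset-minimal explanation."""
    expl = set(features)
    for f in list(features):
        if entails(expl - {f}):
            expl.remove(f)
    return expl

# Toy oracle: the prediction is forced whenever features 0 and 2 are fixed.
entails = lambda fixed: {0, 2} <= fixed
print(subset_minimal_explanation([0, 1, 2, 3], entails))  # {0, 2}
```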
An improved neural network model for joint POS tagging and dependency parsing
Title | An improved neural network model for joint POS tagging and dependency parsing |
Authors | Dat Quoc Nguyen, Karin Verspoor |
Abstract | We propose a novel neural network model for joint part-of-speech (POS) tagging and dependency parsing. Our model extends the well-known BIST graph-based dependency parser (Kiperwasser and Goldberg, 2016) by incorporating a BiLSTM-based tagging component to produce automatically predicted POS tags for the parser. On the benchmark English Penn treebank, our model obtains strong UAS and LAS scores at 94.51% and 92.87%, respectively, producing 1.5+% absolute improvements over the BIST graph-based parser, and also obtaining a state-of-the-art POS tagging accuracy at 97.97%. Furthermore, experimental results on parsing 61 “big” Universal Dependencies treebanks from raw texts show that our model outperforms the baseline UDPipe (Straka and Straková, 2017) with 0.8% higher average POS tagging score and 3.6% higher average LAS score. In addition, with our model, we also obtain state-of-the-art downstream task scores for biomedical event extraction and opinion analysis applications. Our code is available together with all pre-trained models at: https://github.com/datquocnguyen/jPTDP |
Tasks | Dependency Parsing, Part-Of-Speech Tagging |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.03955v2 |
PDF | http://arxiv.org/pdf/1807.03955v2.pdf |
PWC | https://paperswithcode.com/paper/an-improved-neural-network-model-for-joint |
Repo | https://github.com/datquocnguyen/jPTDP |
Framework | none |
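A structural sketch of the joint design (not the jPTDP code): a BiLSTM tagger predicts POS tags, and each predicted tag's embedding is fed, together with the word embedding, into the parser's own BiLSTM encoder. All dimensions are illustrative, and the arc scorer is omitted.

```python
import torch
import torch.nn as nn

class JointTaggerParserEncoder(nn.Module):
    def __init__(self, vocab, n_tags, wdim=100, tdim=32, hid=128):
        super().__init__()
        self.wemb = nn.Embedding(vocab, wdim)
        self.tag_lstm = nn.LSTM(wdim, hid, batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * hid, n_tags)
        self.temb = nn.Embedding(n_tags, tdim)
        self.parse_lstm = nn.LSTM(wdim + tdim, hid, batch_first=True,
                                  bidirectional=True)

    def forward(self, words):                         # words: (batch, seq)
        w = self.wemb(words)
        tag_logits = self.tagger(self.tag_lstm(w)[0])
        tags = tag_logits.argmax(-1)                  # predicted POS tags
        h, _ = self.parse_lstm(torch.cat([w, self.temb(tags)], dim=-1))
        return tag_logits, h   # h would feed biaffine/MLP arc scoring

logits, h = JointTaggerParserEncoder(5000, 17)(torch.randint(0, 5000, (2, 12)))
print(logits.shape, h.shape)  # (2, 12, 17) (2, 12, 256)
```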
Style Augmentation: Data Augmentation via Style Randomization
Title | Style Augmentation: Data Augmentation via Style Randomization |
Authors | Philip T. Jackson, Amir Atapour-Abarghouei, Stephen Bonner, Toby Breckon, Boguslaw Obara |
Abstract | We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content. This is accomplished by adapting an arbitrary style transfer network to perform style randomization, by sampling input style embeddings from a multivariate normal distribution instead of inferring them from a style image. In addition to standard classification experiments, we investigate the effect of style augmentation (and data augmentation generally) on domain transfer tasks. We find that data augmentation significantly improves robustness to domain shift, and can be used as a simple, domain agnostic alternative to domain adaptation. Comparing style augmentation against a mix of seven traditional augmentation techniques, we find that it can be readily combined with them to improve network performance. We validate the efficacy of our technique with domain transfer experiments in classification and monocular depth estimation, illustrating consistent improvements in generalization. |
Tasks | Data Augmentation, Depth Estimation, Domain Adaptation, Monocular Depth Estimation, Style Transfer |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05375v2 |
PDF | http://arxiv.org/pdf/1809.05375v2.pdf |
PWC | https://paperswithcode.com/paper/style-augmentation-data-augmentation-via |
Repo | https://github.com/philipjackson/style-augmentation |
Framework | pytorch |
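The core trick in a few lines, with the style network left abstract: instead of encoding a real style image, sample the style embedding from a normal distribution (in practice fitted to embeddings of a large style corpus). Both `style_transfer_net` and the embedding moments `mu`/`sigma` are assumptions here.

```python
import torch

def style_augment(content_img, style_transfer_net, mu, sigma, alpha=0.5):
    """content_img: (B,3,H,W); mu/sigma: per-dimension moments of the
    style-embedding distribution (e.g. fitted on a large style dataset)."""
    z = mu + sigma * torch.randn_like(mu)      # random style, no style image
    stylized = style_transfer_net(content_img, z)
    # interpolate toward the original to control augmentation strength
    return alpha * stylized + (1 - alpha) * content_img
```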
Pose2Seg: Detection Free Human Instance Segmentation
Title | Pose2Seg: Detection Free Human Instance Segmentation |
Authors | Song-Hai Zhang, Ruilong Li, Xin Dong, Paul L. Rosin, Zixi Cai, Han Xi, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu |
Abstract | The standard approach to image instance segmentation is to perform the object detection first, and then segment the object from the detection bounding-box. More recently, deep learning methods like Mask R-CNN perform them jointly. However, little research takes into account the uniqueness of the “human” category, which can be well defined by the pose skeleton. Moreover, the human pose skeleton can be used to better distinguish instances with heavy occlusion than using bounding-boxes. In this paper, we present a brand new pose-based instance segmentation framework for humans which separates instances based on human pose, rather than proposal region detection. We demonstrate that our pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem, and can moreover better handle occlusion. Furthermore, there are few public datasets containing many heavily occluded humans along with comprehensive annotations, which makes this a challenging problem seldom noticed by researchers. Therefore, in this paper we introduce a new benchmark “Occluded Human (OCHuman)”, which focuses on occluded humans with comprehensive annotations including bounding-box, human pose and instance masks. This dataset contains 8110 detailed annotated human instances within 4731 images. With an average 0.67 MaxIoU for each person, OCHuman is the most complex and challenging dataset related to human instance segmentation. Through this dataset, we want to emphasize occlusion as a challenging problem for researchers to study. |
Tasks | Human Instance Segmentation, Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10683v3 |
PDF | http://arxiv.org/pdf/1803.10683v3.pdf |
PWC | https://paperswithcode.com/paper/pose2seg-detection-free-human-instance |
Repo | https://github.com/liruilong940607/OCHumanApi |
Framework | none |
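A small sketch of the MaxIoU statistic quoted above: for each person, the maximum mask IoU with any other person in the same image, so an average of 0.67 means instances overlap heavily. Masks are boolean arrays; this is an illustration, not the benchmark's evaluation code.

```python
import numpy as np

def max_iou(masks):
    """masks: list of (H, W) boolean arrays, one per person."""
    def iou(a, b):
        return np.logical_and(a, b).sum() / max(np.logical_or(a, b).sum(), 1)
    return [max((iou(m, o) for j, o in enumerate(masks) if j != i), default=0.0)
            for i, m in enumerate(masks)]

a = np.zeros((4, 4), bool); a[:2] = True   # person 1: top half
b = np.zeros((4, 4), bool); b[1:3] = True  # person 2: overlapping middle band
print(max_iou([a, b]))  # [0.333..., 0.333...]
```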
Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks
Title | Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks |
Authors | Thomas Brunner, Frederik Diehl, Michael Truong Le, Alois Knoll |
Abstract | We consider adversarial examples for image classification in the black-box decision-based setting. Here, an attacker cannot access confidence scores, but only the final label. Most attacks for this scenario are either unreliable or inefficient. Focusing on the latter, we show that a specific class of attacks, Boundary Attacks, can be reinterpreted as a biased sampling framework that gains efficiency from domain knowledge. We identify three such biases, image frequency, regional masks and surrogate gradients, and evaluate their performance against an ImageNet classifier. We show that the combination of these biases outperforms the state of the art by a wide margin. We also showcase an efficient way to attack the Google Cloud Vision API, where we craft convincing perturbations with just a few hundred queries. Finally, the methods we propose have also been found to work very well against strong defenses: Our targeted attack won second place in the NeurIPS 2018 Adversarial Vision Challenge. |
Tasks | Image Classification |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.09803v3 |
PDF | https://arxiv.org/pdf/1812.09803v3.pdf |
PWC | https://paperswithcode.com/paper/guessing-smart-biased-sampling-for-efficient |
Repo | https://github.com/ttbrunner/biased_boundary_attack_avc |
Framework | none |
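To illustrate the frequency bias alone (one of the paper's three biases; the paper implements its low-frequency prior with Perlin noise), here is a simpler alternative way to sample low-frequency perturbations: populate only the low-frequency corner of the DCT spectrum. The cutoff fraction is an assumption.

```python
import numpy as np
from scipy.fftpack import idct

def low_freq_noise(h, w, frac=0.25):
    """Return (h, w) noise whose energy lies in low spatial frequencies."""
    spec = np.zeros((h, w))
    kh, kw = int(h * frac), int(w * frac)
    spec[:kh, :kw] = np.random.randn(kh, kw)   # only low-freq coefficients
    # inverse 2-D DCT brings the spectrum back to pixel space
    return idct(idct(spec, axis=0, norm="ortho"), axis=1, norm="ortho")

noise = low_freq_noise(224, 224)
print(noise.shape, float(noise.std()))
```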
Face Alignment in Full Pose Range: A 3D Total Solution
Title | Face Alignment in Full Pose Range: A 3D Total Solution |
Authors | Xiangyu Zhu, Xiaoming Liu, Zhen Lei, Stan Z. Li |
Abstract | Face alignment, which fits a face model to an image and extracts the semantic meanings of facial pixels, has been an important topic in the computer vision community. However, most algorithms are designed for faces in small to medium poses (yaw angle is smaller than 45 degrees), which lack the ability to align faces in large poses up to 90 degrees. The challenges are three-fold. Firstly, the commonly used landmark face model assumes that all the landmarks are visible and is therefore not suitable for large poses. Secondly, the face appearance varies more drastically across large poses, from the frontal view to the profile view. Thirdly, labelling landmarks in large poses is extremely challenging since the invisible landmarks have to be guessed. In this paper, we propose to tackle these three challenges in a new alignment framework termed 3D Dense Face Alignment (3DDFA), in which a dense 3D Morphable Model (3DMM) is fitted to the image via Cascaded Convolutional Neural Networks. We also utilize 3D information to synthesize face images in profile views to provide abundant samples for training. Experiments on the challenging AFLW database show that the proposed approach achieves significant improvements over the state-of-the-art methods. |
Tasks | 3D Pose Estimation, Depth Image Estimation, Face Alignment, Face Reconstruction, Pose Estimation |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.01005v1 |
PDF | http://arxiv.org/pdf/1804.01005v1.pdf |
PWC | https://paperswithcode.com/paper/face-alignment-in-full-pose-range-a-3d-total |
Repo | https://github.com/nabeel3133/3D-texture-fitting |
Framework | none |
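The cascaded fitting loop in schematic form (details differ from the paper): at each stage a CNN sees the image plus a feature rendered from the current 3DMM parameters and predicts a residual parameter update. `render_pncc` stands in for the paper's Projected Normalized Coordinate Code rendering, and the 62-dimensional parameter vector is assumed.

```python
import torch

def fit_3dmm(img, cnn, render_pncc, n_iters=3, n_params=62):
    """img: (B,3,H,W). Returns refined 3DMM parameters (B, n_params).
    `cnn` and `render_pncc` are hypothetical callables supplied by the user."""
    params = torch.zeros(img.shape[0], n_params)      # start from the mean face
    for _ in range(n_iters):
        pncc = render_pncc(params, img.shape[-2:])    # (B,3,H,W) rendering
        params = params + cnn(torch.cat([img, pncc], dim=1))  # residual update
    return params
```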