February 2, 2020


Paper Group AWR 57


MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records


Title	MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records
Authors	Xi Sheryl Zhang, Fengyi Tang, Hiroko Dodge, Jiayu Zhou, Fei Wang
Abstract	In recent years, increasing amounts of health data, such as patient Electronic Health Records (EHR), have become readily available. This provides an unprecedented opportunity for knowledge discovery and data mining algorithms to extract insights from them, which can later help improve the quality of care delivery. Predictive modeling of clinical risk, including in-hospital mortality, hospital readmission, chronic disease onset, condition exacerbation, etc., from patient EHR is one of the health data analytics problems that attracts the most interest. The reason is not only that the problem is important in clinical settings, but also that working with EHR poses challenges such as sparsity, irregularity, temporality, etc. Unlike applications in other domains such as computer vision and natural language processing, the labeled data samples in medicine (patients) are relatively limited, which makes it difficult to learn effective predictive models, especially complex ones such as deep neural networks. In this paper, we propose MetaPred, a meta-learning framework for clinical risk prediction from longitudinal patient EHRs. In particular, in order to predict the target risk where there are limited data samples, we train a meta-learner from a set of related risk prediction tasks, which learns how a good predictor is learned. The meta-learner can then be directly used in target risk prediction, and the limited available samples can be used for further fine-tuning the model performance. The effectiveness of MetaPred is tested on a real patient EHR repository from Oregon Health & Science University. We are able to demonstrate that with CNN and RNN as base predictors, MetaPred can achieve much better performance for predicting target risk with low resources compared with the predictor trained on the limited samples available for this risk.
Tasks	Meta-Learning
Published	2019-05-08
URL	https://arxiv.org/abs/1905.03218v1
PDF	https://arxiv.org/pdf/1905.03218v1.pdf
PWC	https://paperswithcode.com/paper/metapred-meta-learning-for-clinical-risk
Repo	https://github.com/sheryl-ai/MetaPred
Framework	tf

A Partially Reversible U-Net for Memory-Efficient Volumetric Image Segmentation


Title	A Partially Reversible U-Net for Memory-Efficient Volumetric Image Segmentation
Authors	Robin Brügger, Christian F. Baumgartner, Ender Konukoglu
Abstract	One of the key drawbacks of 3D convolutional neural networks for segmentation is their memory footprint, which necessitates compromises in the network architecture in order to fit into a given memory budget. Motivated by the RevNet for image classification, we propose a partially reversible U-Net architecture that reduces memory consumption substantially. The reversible architecture allows us to exactly recover each layer’s outputs from the subsequent layer’s ones, eliminating the need to store activations for backpropagation. This alleviates the biggest memory bottleneck and enables very deep (theoretically infinitely deep) 3D architectures. On the BraTS challenge dataset, we demonstrate substantial memory savings. We further show that the freed memory can be used for processing the whole field-of-view (FOV) instead of patches. Increasing network depth led to higher segmentation accuracy while growing the memory footprint only by a very small fraction, thanks to the partially reversible architecture.
Tasks	Image Classification, Medical Image Segmentation, Semantic Segmentation
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06148v2
PDF	https://arxiv.org/pdf/1906.06148v2.pdf
PWC	https://paperswithcode.com/paper/a-partially-reversible-u-net-for-memory
Repo	https://github.com/RobinBruegger/RevTorch
Framework	pytorch
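
The abstract above says a reversible architecture can recover each layer's outputs from those of the subsequent layer. A minimal sketch of that idea, assuming the additive coupling used by RevNet-style blocks, is shown here. The functions `f` and `g` are hypothetical stand-ins for the learned sub-networks and are not taken from the RevTorch repository. In floating point the recovery is exact only up to rounding error.

```python
import math

def f(v):
    # Hypothetical residual function F (stand-in for a learned sub-network).
    return [math.tanh(a) for a in v]

def g(v):
    # Hypothetical residual function G (stand-in for a learned sub-network).
    return [0.5 * a * a for a in v]

def forward(x1, x2):
    # Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1).
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the coupling in reverse order: x2 = y2 - G(y1), x1 = y1 - F(x2).
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

x1, x2 = [0.5, -1.0], [2.0, 0.25]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)  # inputs recomputed from outputs, nothing stored
```

Because the inputs can be recomputed from the outputs during the backward pass, the block's activations do not need to be kept in memory, which is the saving the abstract refers to.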

MMDetection: Open MMLab Detection Toolbox and Benchmark


Title	MMDetection: Open MMLab Detection Toolbox and Benchmark
Authors	Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
Abstract	We present MMDetection, an object detection toolbox that contains a rich set of object detection and instance segmentation methods as well as related components and modules. The toolbox started from the codebase of the MMDet team, which won the detection track of the COCO Challenge 2018. It has gradually evolved into a unified platform that covers many popular detection methods and contemporary modules. It not only includes training and inference code, but also provides weights for more than 200 network models. We believe this toolbox is by far the most complete detection toolbox. In this paper, we introduce the various features of this toolbox. In addition, we also conduct a benchmarking study on different methods, components, and their hyper-parameters. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop new detectors. Code and models are available at https://github.com/open-mmlab/mmdetection. The project is under active development and we will keep this document updated.
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07155v1
PDF	https://arxiv.org/pdf/1906.07155v1.pdf
PWC	https://paperswithcode.com/paper/mmdetection-open-mmlab-detection-toolbox-and
Repo	https://github.com/UsmannK/cloud-mmdetection
Framework	pytorch

Supervise Thyself: Examining Self-Supervised Representations in Interactive Environments


Title	Supervise Thyself: Examining Self-Supervised Representations in Interactive Environments
Authors	Evan Racah, Christopher Pal
Abstract	Self-supervised methods, wherein an agent learns representations solely by observing the results of its actions, become crucial in environments which do not provide a dense reward signal or have labels. In most cases, such methods are used for pretraining or auxiliary tasks for “downstream” tasks, such as control, exploration, or imitation learning. However, it is not clear which method’s representations best capture meaningful features of the environment, and which are best suited for which types of environments. We present a small-scale study of self-supervised methods on two visual environments: Flappy Bird and Sonic The Hedgehog. In particular, we quantitatively evaluate the representations learned from these tasks in two contexts: a) the extent to which the representations capture true state information of the agent and b) how generalizable these representations are to novel situations, like new levels and textures. Lastly, we evaluate these self-supervised features by visualizing which parts of the environment they focus on. Our results show that the utility of the representations is highly dependent on the visuals and dynamics of the environment.
Tasks	Imitation Learning
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11951v1
PDF	https://arxiv.org/pdf/1906.11951v1.pdf
PWC	https://paperswithcode.com/paper/supervise-thyself-examining-self-supervised
Repo	https://github.com/eracah/supervise-thyself
Framework	pytorch

DeepGCNs: Can GCNs Go as Deep as CNNs?


Title	DeepGCNs: Can GCNs Go as Deep as CNNs?
Authors	Guohao Li, Matthias Müller, Ali Thabet, Bernard Ghanem
Abstract	Convolutional Neural Networks (CNNs) achieve impressive performance in a wide variety of fields. Their success benefited from a massive boost when very deep CNN models were able to be reliably trained. Despite their merits, CNNs fail to properly address problems with non-Euclidean data. To overcome this challenge, Graph Convolutional Networks (GCNs) build graphs to represent non-Euclidean data, borrow concepts from CNNs, and apply them in training. GCNs show promising results, but they are usually limited to very shallow models due to the vanishing gradient problem. As a result, most state-of-the-art GCN models are no deeper than 3 or 4 layers. In this work, we present new ways to successfully train very deep GCNs. We do this by borrowing concepts from CNNs, specifically residual/dense connections and dilated convolutions, and adapting them to GCN architectures. Extensive experiments show the positive effect of these deep GCN frameworks. Finally, we use these new concepts to build a very deep 56-layer GCN, and show how it significantly boosts performance (+3.7% mIoU over state-of-the-art) in the task of point cloud semantic segmentation. We believe that the community can greatly benefit from this work, as it opens up many opportunities for advancing GCN-based research.
Tasks	3D Semantic Segmentation, Semantic Segmentation
Published	2019-04-07
URL	https://arxiv.org/abs/1904.03751v2
PDF	https://arxiv.org/pdf/1904.03751v2.pdf
PWC	https://paperswithcode.com/paper/can-gcns-go-as-deep-as-cnns
Repo	https://github.com/lightaime/deep_gcns
Framework	tf
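
The abstract above mentions adapting residual connections and dilated convolutions to GCNs. The sketch below shows one way to read those two ideas on a toy point set: a dilated k-nearest-neighbor selection (rank the nearest `k * dilation` points, keep every `dilation`-th one) followed by a residual feature update. The function names, the mean aggregation, and the 1-D data are illustrative assumptions and do not come from the deep_gcns repository.

```python
def dilated_knn(points, query_index, k, dilation):
    # Rank all other points by squared distance to the query point.
    q = points[query_index]

    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    others = [i for i in range(len(points)) if i != query_index]
    ranked = sorted(others, key=lambda i: sq_dist(points[i]))
    # Keep every `dilation`-th neighbor among the k * dilation nearest.
    return ranked[: k * dilation : dilation]

def residual_update(features, node, neighbors):
    # Residual connection: new feature = old feature + aggregated message.
    message = sum(features[j] for j in neighbors) / len(neighbors)
    return features[node] + message

points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

neighbors = dilated_knn(points, query_index=0, k=2, dilation=2)
updated = residual_update(features, 0, neighbors)
```

With `dilation=1` the selection reduces to ordinary k-NN. Larger dilation values widen the receptive field without increasing the number of neighbors aggregated.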

Causal Confusion in Imitation Learning


Title	Causal Confusion in Imitation Learning
Authors	Pim de Haan, Dinesh Jayaraman, Sergey Levine
Abstract	Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment. We point out that ignoring causality is particularly damaging because of the distributional shift in imitation learning. In particular, it leads to a counter-intuitive “causal misidentification” phenomenon: access to more information can yield worse performance. We investigate how this problem arises, and propose a solution to combat it through targeted interventions—either environment interaction or expert queries—to determine the correct causal model. We show that causal misidentification occurs in several benchmark control domains as well as realistic driving settings, and validate our solution against DAgger and other baselines and ablations.
Tasks	Imitation Learning
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11979v2
PDF	https://arxiv.org/pdf/1905.11979v2.pdf
PWC	https://paperswithcode.com/paper/causal-confusion-in-imitation-learning
Repo	https://github.com/pimdh/causal-confusion
Framework	pytorch

N-BEATS: Neural basis expansion analysis for interpretable time series forecasting


Title	N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
Authors	Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio
Abstract	We focus on solving the univariate time series point forecasting problem using deep learning. We propose a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. The architecture has a number of desirable properties, being interpretable, applicable without modification to a wide array of target domains, and fast to train. We test the proposed architecture on several well-known datasets, including M3, M4 and TOURISM competition datasets containing time series from diverse domains. We demonstrate state-of-the-art performance for two configurations of N-BEATS for all the datasets, improving forecast accuracy by 11% over a statistical benchmark and by 3% over last year’s winner of the M4 competition, a domain-adjusted hand-crafted hybrid between neural network and statistical time series models. The first configuration of our model does not employ any time-series-specific components and its performance on heterogeneous datasets strongly suggests that, contrary to received wisdom, deep learning primitives such as residual blocks are by themselves sufficient to solve a wide range of forecasting problems. Finally, we demonstrate how the proposed architecture can be augmented to provide outputs that are interpretable without considerable loss in accuracy.
Tasks	Time Series, Time Series Forecasting
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10437v4
PDF	https://arxiv.org/pdf/1905.10437v4.pdf
PWC	https://paperswithcode.com/paper/n-beats-neural-basis-expansion-analysis-for
Repo	https://github.com/amitesh863/nbeats_forecast
Framework	pytorch
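
The "backward and forward residual links" in the abstract above can be illustrated with a small sketch. Each block emits a backcast (its reconstruction of the input) and a forecast. The backcast is subtracted from the input before the next block sees it, and the forecasts are summed. `toy_block` is a hypothetical hand-written block used only to make the data flow visible. In the real model each block is a learned stack of fully-connected layers.

```python
def toy_block(x, horizon):
    # Hypothetical block: explains half of its input (backcast) and
    # forecasts the input mean for every step of the horizon.
    backcast = [0.5 * v for v in x]
    forecast = [sum(x) / len(x)] * horizon
    return backcast, forecast

def doubly_residual_stack(x, horizon, num_blocks):
    residual = list(x)
    total_forecast = [0.0] * horizon
    for _ in range(num_blocks):
        backcast, forecast = toy_block(residual, horizon)
        # Backward link: remove what this block already explained.
        residual = [r - b for r, b in zip(residual, backcast)]
        # Forward link: accumulate partial forecasts.
        total_forecast = [t + f for t, f in zip(total_forecast, forecast)]
    return total_forecast, residual

forecast, residual = doubly_residual_stack([2.0, 4.0], horizon=1, num_blocks=2)
print(forecast, residual)  # [4.5] [0.5, 1.0]
```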

Joint Ranking SVM and Binary Relevance with Robust Low-Rank Learning for Multi-Label Classification


Title	Joint Ranking SVM and Binary Relevance with Robust Low-Rank Learning for Multi-Label Classification
Authors	Guoqiang Wu, Ruobing Zheng, Yingjie Tian, Dalian Liu
Abstract	Multi-label classification studies the task where each example belongs to multiple labels simultaneously. As a representative method, Ranking Support Vector Machine (Rank-SVM) aims to minimize the Ranking Loss and can also mitigate the negative influence of the class-imbalance issue. However, because of its stacking-style thresholding, it may suffer from error accumulation, which reduces the final classification performance. Binary Relevance (BR) is another typical method, which aims to minimize the Hamming Loss and only needs one-step learning. Nevertheless, it may suffer from the class-imbalance issue and does not take label correlations into account. To address the above issues, we propose a novel multi-label classification model, which joins Ranking support vector machine and Binary Relevance with robust Low-rank learning (RBRL). RBRL inherits the ranking loss minimization advantages of Rank-SVM, and thus overcomes the disadvantages of BR, namely the class-imbalance issue and ignoring label correlations. Meanwhile, it utilizes the Hamming loss minimization and one-step learning advantages of BR, and thus avoids the disadvantage of Rank-SVM, which requires a separate thresholding learning step. In addition, a low-rank constraint is utilized to further exploit high-order label correlations under the assumption of a low-dimensional label space. Furthermore, to obtain nonlinear multi-label classifiers, we derive a kernelized RBRL. Two accelerated proximal gradient (APG) methods are used to solve the optimization problems efficiently. Extensive comparative experiments with several state-of-the-art methods show that RBRL achieves highly competitive or superior performance.
Tasks	Multi-Label Classification
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01658v1
PDF	https://arxiv.org/pdf/1911.01658v1.pdf
PWC	https://paperswithcode.com/paper/joint-ranking-svm-and-binary-relevance-with
Repo	https://github.com/GuoqiangWoodrowWu/RBRL
Framework	none
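
The two objectives contrasted in the abstract above, Ranking Loss and Hamming Loss, can be made concrete on a single example. This is a sketch using the common textbook definitions, not code from the RBRL repository. The 0.5 decision threshold and the sample scores are illustrative assumptions.

```python
def hamming_loss(y_true, y_pred):
    # Fraction of labels whose predicted value differs from the truth.
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def ranking_loss(y_true, scores):
    # Fraction of (relevant, irrelevant) label pairs in which the
    # irrelevant label is scored at least as high as the relevant one.
    relevant = [i for i, t in enumerate(y_true) if t == 1]
    irrelevant = [i for i, t in enumerate(y_true) if t == 0]
    wrong = sum(1 for r in relevant for n in irrelevant
                if scores[r] <= scores[n])
    return wrong / (len(relevant) * len(irrelevant))

y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(hamming_loss(y_true, y_pred))  # 0.5
print(ranking_loss(y_true, scores))  # 0.25
```

The example shows why the two losses can disagree: the ranking is mostly right (one bad pair out of four), yet a fixed threshold still misclassifies half of the labels.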

GlobalTrack: A Simple and Strong Baseline for Long-term Tracking


Title	GlobalTrack: A Simple and Strong Baseline for Long-term Tracking
Authors	Lianghua Huang, Xin Zhao, Kaiqi Huang
Abstract	A key capability of a long-term tracker is to search for targets in very large areas (typically the entire image) to handle possible target absences or tracking failures. However, currently there is a lack of such a strong baseline for global instance search. In this work, we aim to bridge this gap. Specifically, we propose GlobalTrack, a pure global instance search based tracker that makes no assumption on the temporal consistency of the target’s positions and scales. GlobalTrack is developed based on two-stage object detectors, and it is able to perform full-image and multi-scale search of arbitrary instances with only a single query as the guide. We further propose a cross-query loss to improve the robustness of our approach against distractors. With no online learning, no punishment on position or scale changes, no scale smoothing and no trajectory refinement, our pure global instance search based tracker achieves comparable, sometimes much better performance on four large-scale tracking benchmarks (i.e., 52.1% AUC on LaSOT, 63.8% success rate on TLP, 60.3% MaxGM on OxUvA and 75.4% normalized precision on TrackingNet), compared to state-of-the-art approaches that typically require complex post-processing. More importantly, our tracker runs without cumulative errors, i.e., any type of temporary tracking failures will not affect its performance on future frames, making it ideal for long-term tracking. We hope this work will be a strong baseline for long-term tracking and will stimulate future works in this area. Code is available at https://github.com/huanglianghua/GlobalTrack.
Tasks	Instance Search
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08531v1
PDF	https://arxiv.org/pdf/1912.08531v1.pdf
PWC	https://paperswithcode.com/paper/globaltrack-a-simple-and-strong-baseline-for
Repo	https://github.com/huanglianghua/GlobalTrack
Framework	pytorch

Single-Forward-Step Projective Splitting: Exploiting Cocoercivity


Title	Single-Forward-Step Projective Splitting: Exploiting Cocoercivity
Authors	Patrick R. Johnstone, Jonathan Eckstein
Abstract	This work describes a new variant of projective splitting for monotone inclusions, in which cocoercive operators can be processed with a single forward step per iteration. This result establishes a symmetry between projective splitting algorithms, the classical forward-backward splitting method (FB), and Tseng’s forward-backward-forward method (FBF). Another symmetry is that the new procedure allows for larger stepsizes for cocoercive operators: the stepsize bound is $2\beta$ for a $\beta$-cocoercive operator, which is the same as for FB. To complete the connection, we show that FB corresponds to an unattainable boundary case of the parameters in the new procedure. Unlike FB, the new method allows for a backtracking procedure when the cocoercivity constant is unknown. Proving convergence of the algorithm requires some departures from the usual proof framework for projective splitting. We close with some computational tests establishing competitive performance for the method.
Tasks
Published	2019-02-24
URL	https://arxiv.org/abs/1902.09025v2
PDF	https://arxiv.org/pdf/1902.09025v2.pdf
PWC	https://paperswithcode.com/paper/single-forward-step-projective-splitting
Repo	https://github.com/projective-splitting/coco
Framework	none
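
For context on the classical forward-backward splitting method (FB) that the abstract above compares against, here is a minimal sketch on a one-dimensional problem. It is plain FB, not the projective splitting variant proposed in the paper. The problem, minimizing 0.5 * (x - a)^2 + lam * |x|, is an illustrative assumption. Its smooth part has a 1-Lipschitz gradient, so the gradient is 1-cocoercive and the stepsize bound from the abstract is 2.

```python
def soft_threshold(v, t):
    # Proximal operator of t * |x| (the backward step).
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def forward_backward(a, lam, step, iterations):
    x = 0.0
    for _ in range(iterations):
        # Forward (gradient) step on 0.5 * (x - a)^2, whose gradient is x - a.
        y = x - step * (x - a)
        # Backward (proximal) step on lam * |x|.
        x = soft_threshold(y, step * lam)
    return x

# The step 1.5 is below the bound 2 * beta = 2 for this beta = 1 operator.
x_star = forward_backward(a=3.0, lam=1.0, step=1.5, iterations=100)
```

For these values the minimizer is `soft_threshold(3.0, 1.0)`, that is 2.0, and the iterates approach it geometrically.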

CASTER: Predicting Drug Interactions with Chemical Substructure Representation


Title	CASTER: Predicting Drug Interactions with Chemical Substructure Representation
Authors	Kexin Huang, Cao Xiao, Trong Nghia Hoang, Lucas M. Glass, Jimeng Sun
Abstract	Adverse drug-drug interactions (DDIs) remain a leading cause of morbidity and mortality. Identifying potential DDIs during the drug design process is critical for patients and society. Although several computational models have been proposed for DDI prediction, there are still limitations: (1) specialized design of drug representation for DDI predictions is lacking; (2) predictions are based on limited labelled data and do not generalize well to unseen drugs or DDIs; and (3) models are characterized by a large number of parameters, thus are hard to interpret. In this work, we develop a ChemicAl SubstrucTurE Representation (CASTER) framework that predicts DDIs given chemical structures of drugs. CASTER aims to mitigate these limitations via (1) a sequential pattern mining module rooted in the DDI mechanism to efficiently characterize functional sub-structures of drugs; (2) an auto-encoding module that leverages both labelled and unlabelled chemical structure data to improve predictive accuracy and generalizability; and (3) a dictionary learning module that explains the prediction via a small set of coefficients which measure the relevance of each input sub-structure to the DDI outcome. We evaluated CASTER on two real-world DDI datasets and showed that it performed better than state-of-the-art baselines and provided interpretable predictions.
Tasks	Dictionary Learning, Sequential Pattern Mining
Published	2019-11-15
URL	https://arxiv.org/abs/1911.06446v2
PDF	https://arxiv.org/pdf/1911.06446v2.pdf
PWC	https://paperswithcode.com/paper/caster-predicting-drug-interactions-with
Repo	https://github.com/kexinhuang12345/CASTER
Framework	none

BERT for Coreference Resolution: Baselines and Analysis


Title	BERT for Coreference Resolution: Baselines and Analysis
Authors	Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer
Abstract	We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing. Our code and models are publicly available.
Tasks	Coreference Resolution
Published	2019-08-24
URL	https://arxiv.org/abs/1908.09091v4
PDF	https://arxiv.org/pdf/1908.09091v4.pdf
PWC	https://paperswithcode.com/paper/bert-for-coreference-resolution-baselines-and
Repo	https://github.com/mandarjoshi90/coref
Framework	tf

Regularizing Activation Distribution for Training Binarized Deep Networks


Title	Regularizing Activation Distribution for Training Binarized Deep Networks
Authors	Ruizhou Ding, Ting-Wu Chin, Zeye Liu, Diana Marculescu
Abstract	Binarized Neural Networks (BNNs) can significantly reduce the inference latency and energy consumption in resource-constrained devices due to their pure-logical computation and fewer memory accesses. However, training BNNs is difficult since the activation flow encounters degeneration, saturation, and gradient mismatch problems. Prior work alleviates these issues by increasing activation bits and adding floating-point scaling factors, thereby sacrificing BNN’s energy efficiency. In this paper, we propose to use distribution loss to explicitly regularize the activation flow, and develop a framework to systematically formulate the loss. Our experiments show that the distribution loss can consistently improve the accuracy of BNNs without losing their energy benefits. Moreover, equipped with the proposed regularization, BNN training is shown to be robust to the selection of hyper-parameters including optimizer and learning rate.
Tasks
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02823v1
PDF	http://arxiv.org/pdf/1904.02823v1.pdf
PWC	https://paperswithcode.com/paper/regularizing-activation-distribution-for
Repo	https://github.com/ruizhoud/DistributionLoss
Framework	pytorch

Bengali Handwritten Character Classification using Transfer Learning on Deep Convolutional Neural Network


Title	Bengali Handwritten Character Classification using Transfer Learning on Deep Convolutional Neural Network
Authors	Swagato Chatterjee, Rwik Kumar Dutta, Debayan Ganguly, Kingshuk Chatterjee, Sudipta Roy
Abstract	In this paper, we propose a solution which uses state-of-the-art techniques in Deep Learning to tackle the problem of Bengali Handwritten Character Recognition (HCR). Our method uses fewer iterations to train than most other comparable methods. We employ Transfer Learning on ResNet 50, a state-of-the-art deep Convolutional Neural Network model, pretrained on the ImageNet dataset. We also use other techniques, such as a modified version of the One Cycle Policy and varying the input image sizes, to speed up training. We evaluate our technique on the BanglaLekha-Isolated dataset, which consists of 84 classes (50 Basic, 10 Numerals and 24 Compound Characters). We are able to achieve 96.12% accuracy in just 47 epochs on the BanglaLekha-Isolated dataset. When comparing our method with that of other researchers, considering the number of classes and without using Ensemble Learning, the proposed solution achieves a state-of-the-art result for Handwritten Bengali Character Recognition. Code and weight files are available at https://github.com/swagato-c/bangla-hwcr-present.
Tasks	Transfer Learning
Published	2019-02-25
URL	http://arxiv.org/abs/1902.11133v1
PDF	http://arxiv.org/pdf/1902.11133v1.pdf
PWC	https://paperswithcode.com/paper/bengali-handwritten-character-classification
Repo	https://github.com/swagato-c/bangla-hwcr-present
Framework	pytorch
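
The abstract above mentions a modified version of the One Cycle Policy without giving details. As background only, the sketch below shows a basic one-cycle learning-rate schedule: a linear ramp up to a peak rate followed by a linear ramp back down. The function name, the 30% warm-up fraction, and the rate bounds are assumptions for illustration. The authors' modification is not reproduced here.

```python
def one_cycle_lr(step, total_steps, lr_min, lr_max, warmup_fraction=0.3):
    # Step index at which the learning rate peaks (at least 1 to avoid
    # division by zero for very short schedules).
    peak = max(1, int(total_steps * warmup_fraction))
    if step <= peak:
        # Linear ramp from lr_min up to lr_max.
        progress = step / peak
        return lr_min + progress * (lr_max - lr_min)
    # Linear ramp from lr_max back down to lr_min.
    progress = (step - peak) / (total_steps - peak)
    return lr_max - progress * (lr_max - lr_min)

schedule = [one_cycle_lr(s, 100, 0.001, 0.01) for s in range(101)]
```

Here `schedule` starts at the minimum rate, peaks at step 30, and returns to the minimum rate at step 100.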

Modeling Combinatorial Evolution in Time Series Prediction


Title	Modeling Combinatorial Evolution in Time Series Prediction
Authors	Wenjie Hu, Yang Yang, Zilong You, Zongtao Liu, Xiang Ren
Abstract	Time series modeling aims to capture the intrinsic factors underpinning observed data and its evolution. However, most existing studies ignore the evolutionary relations among these factors, which are what cause the combinatorial evolution of a given time series. In this paper, we propose to represent time-varying relations among intrinsic factors of time series data by means of an evolutionary state graph structure. Accordingly, we propose the Evolutionary Graph Recurrent Networks (EGRN) to learn representations of these factors, along with the given time series, using a graph neural network framework. The learned representations can then be applied to time series classification tasks. From our experiment results, based on six real-world datasets, it can be seen that our approach clearly outperforms ten state-of-the-art baseline methods (e.g. +5% in terms of accuracy, and +15% in terms of F1 on average). In addition, we demonstrate that due to the graph structure’s improved interpretability, our method is also able to explain the logical causes of the predicted events.
Tasks	Time Series, Time Series Classification, Time Series Prediction
Published	2019-05-10
URL	https://arxiv.org/abs/1905.05006v2
PDF	https://arxiv.org/pdf/1905.05006v2.pdf
PWC	https://paperswithcode.com/paper/190505006
Repo	https://github.com/VachelHU/ESGRN
Framework	tf