January 29, 2020

Paper Group ANR 722

Improved Document Modelling with a Neural Discourse Parser. Segmentation of Lumen and External Elastic Laminae in Intravascular Ultrasound Images using Ultrasonic Backscattering Physics Initialized Multiscale Random Walks. Analysis of Software Engineering for Agile Machine Learning Projects. Ludwig: a type-based declarative deep learning toolbox. D …

Improved Document Modelling with a Neural Discourse Parser


Title	Improved Document Modelling with a Neural Discourse Parser
Authors	Fajri Koto, Jey Han Lau, Timothy Baldwin
Abstract	Despite the success of attention-based neural models for natural language generation and classification tasks, they are unable to capture the discourse structure of larger documents. We hypothesize that explicit discourse representations have utility for NLP tasks over longer documents or document sequences, which sequence-to-sequence models are unable to capture. For abstractive summarization, for instance, conventional neural models simply match source documents and the summary in a latent space without explicit representation of text structure or relations. In this paper, we propose to use neural discourse representations obtained from a rhetorical structure theory (RST) parser to enhance document representations. Specifically, document representations are generated for discourse spans, known as the elementary discourse units (EDUs). We empirically investigate the benefit of the proposed approach on two different tasks: abstractive summarization and popularity prediction of online petitions. We find that the proposed approach leads to improvements in all cases.
Tasks	Abstractive Text Summarization, Text Generation
Published	2019-11-16
URL	https://arxiv.org/abs/1911.06919v1
PDF	https://arxiv.org/pdf/1911.06919v1.pdf
PWC	https://paperswithcode.com/paper/improved-document-modelling-with-a-neural
Repo
Framework

Segmentation of Lumen and External Elastic Laminae in Intravascular Ultrasound Images using Ultrasonic Backscattering Physics Initialized Multiscale Random Walks


Title	Segmentation of Lumen and External Elastic Laminae in Intravascular Ultrasound Images using Ultrasonic Backscattering Physics Initialized Multiscale Random Walks
Authors	Debarghya China, Pabitra Mitra, Debdoot Sheet
Abstract	Coronary artery disease accounts for a large number of deaths across the world, and clinicians generally prefer using x-ray computed tomography or magnetic resonance imaging for localizing vascular pathologies. Interventional imaging modalities like intravascular ultrasound (IVUS) are used as an adjunct in diagnosing atherosclerotic plaques in vessels, and help assess the morphological state of the vessel and plaque, which plays a significant role in treatment planning. Since speckle intensity in IVUS images is inherently stochastic and makes it hard for clinicians to see the vessel wall boundaries accurately, segmentation requires automation. In this paper we present a method for segmenting the lumen and external elastic laminae of the artery wall in IVUS images using random walks over a multiscale pyramid of Gaussian decomposed frames. The seeds for the random walker are initialized by supervised learning of ultrasonic backscattering and attenuation statistical mechanics from labelled training samples. We have experimentally evaluated the performance using 77 IVUS images acquired at 40 MHz that are available in the IVUS segmentation challenge dataset (http://www.cvc.uab.es/IVUSchallenge2011/dataset.html) to obtain a Jaccard score of 0.89 ± 0.14 for lumen and 0.85 ± 0.12 for external elastic laminae segmentation over a 10-fold cross-validation study.
Tasks
Published	2019-01-21
URL	http://arxiv.org/abs/1901.06926v1
PDF	http://arxiv.org/pdf/1901.06926v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-of-lumen-and-external-elastic
Repo
Framework
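
The Jaccard score reported in the abstract above measures the overlap between a predicted segmentation mask and a reference mask as |A ∩ B| / |A ∪ B|. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `jaccard_score` and the toy masks are assumptions made for illustration.

```python
from typing import Sequence


def jaccard_score(pred: Sequence[Sequence[int]],
                  truth: Sequence[Sequence[int]]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| for two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel labelled foreground in both masks
            if p or t:
                union += 1         # pixel labelled foreground in at least one mask
    # Two empty masks agree perfectly; define the score as 1.0 in that case.
    return 1.0 if union == 0 else intersection / union


# Toy 3x4 masks (hypothetical data, not taken from the IVUS dataset).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

print(f"Jaccard score: {jaccard_score(predicted, reference):.3f}")
```

In the toy example the masks share 4 foreground pixels out of 6 in their union, so the score is 4/6.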

Analysis of Software Engineering for Agile Machine Learning Projects


Title	Analysis of Software Engineering for Agile Machine Learning Projects
Authors	Kushal Singla, Joy Bose, Chetan Naik
Abstract	The number of machine learning, artificial intelligence or data science related software engineering projects using Agile methodology is increasing. However, there are very few studies on how such projects work in practice. In this paper, we analyze project issues tracking data taken from Scrum (a popular tool for Agile) for several machine learning projects. We compare this data with corresponding data from non-machine learning projects, in an attempt to analyze how machine learning projects are executed differently from normal software engineering projects. On analysis, we find that machine learning project issues use different kinds of words to describe issues, have higher number of exploratory or research oriented tasks as compared to implementation tasks, and have a higher number of issues in the product backlog after each sprint, denoting that it is more difficult to estimate the duration of machine learning project related tasks in advance. After analyzing this data, we propose a few ways in which Agile machine learning projects can be better logged and executed, given their differences with normal software engineering projects.
Tasks
Published	2019-12-16
URL	https://arxiv.org/abs/1912.07323v1
PDF	https://arxiv.org/pdf/1912.07323v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-software-engineering-for-agile
Repo
Framework

Ludwig: a type-based declarative deep learning toolbox


Title	Ludwig: a type-based declarative deep learning toolbox
Authors	Piero Molino, Yaroslav Dudin, Sai Sumanth Miryala
Abstract	In this work we present Ludwig, a flexible, extensible and easy to use toolbox which allows users to train deep learning models and use them for obtaining predictions without writing code. Ludwig implements a novel approach to deep learning model building based on two main abstractions: data types and declarative configuration files. The data type abstraction allows for easier code and sub-model reuse, and the standardized interfaces imposed by this abstraction allow for encapsulation and make the code easy to extend. Declarative model definition configuration files enable inexperienced users to obtain effective models and increase the productivity of expert users. Alongside these two innovations, Ludwig introduces a general modularized deep learning architecture called Encoder-Combiner-Decoder that can be instantiated to perform a vast amount of machine learning tasks. These innovations make it possible for engineers, scientists from other fields and, in general, a much broader audience to adopt deep learning models for their tasks, concretely helping in its democratization.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07930v1
PDF	https://arxiv.org/pdf/1909.07930v1.pdf
PWC	https://paperswithcode.com/paper/ludwig-a-type-based-declarative-deep-learning
Repo
Framework

Deep Learning for Prostate Pathology


Title	Deep Learning for Prostate Pathology
Authors	Okyaz Eminaga, Yuri Tolkach, Christian Kunder, Mahmood Abbas, Ryan Han, Rosalie Nolley, Axel Semjonow, Martin Boegemann, Sebastian Huss, Andreas Loening, Robert West, Geoffrey Sonn, Richard Fan, Olaf Bettendorf, James Brook, Daniel Rubin
Abstract	The current study detects different morphologies related to prostate pathology using deep learning models; these models were evaluated on 2,121 hematoxylin and eosin (H&E) stain histology images captured using bright field microscopy, which spanned a variety of image qualities, origins (whole slide, tissue micro array, whole mount, Internet), scanning machines, timestamps, H&E staining protocols, and institutions. For case usage, these models were applied for the annotation tasks in clinician-oriented pathology reports for prostatectomy specimens. The true positive rate (TPR) for slides with prostate cancer was 99.7% at a false positive rate of 0.785%. The F1-scores of Gleason patterns reported in pathology reports ranged from 0.795 to 1.0 at the case level. TPR was 93.6% for the cribriform morphology and 72.6% for the ductal morphology. The correlation between the ground truth and the prediction for the relative tumor volume was 0.987. Our models cover the major components of prostate pathology and successfully accomplish the annotation tasks.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.04918v3
PDF	https://arxiv.org/pdf/1910.04918v3.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-prostate-pathology
Repo
Framework

No Permanent Friends or Enemies: Tracking Relationships between Nations from News


Title	No Permanent Friends or Enemies: Tracking Relationships between Nations from News
Authors	Xiaochuang Han, Eunsol Choi, Chenhao Tan
Abstract	Understanding the dynamics of international politics is important yet challenging for civilians. In this work, we explore unsupervised neural models to infer relations between nations from news articles. We extend existing models by incorporating shallow linguistics information and propose a new automatic evaluation metric that aligns relationship dynamics with manually annotated key events. As understanding international relations requires carefully analyzing complex relationships, we conduct in-person human evaluations with three groups of participants. Overall, humans prefer the outputs of our model and give insightful feedback that suggests future directions for human-centered models. Furthermore, our model reveals interesting regional differences in news coverage. For instance, with respect to US-China relations, Singaporean media focus more on “strengthening” and “purchasing”, while US media focus more on “criticizing” and “denouncing”.
Tasks
Published	2019-04-18
URL	http://arxiv.org/abs/1904.08950v1
PDF	http://arxiv.org/pdf/1904.08950v1.pdf
PWC	https://paperswithcode.com/paper/no-permanent-friends-or-enemies-tracking
Repo
Framework

Diverse Exploration via Conjugate Policies for Policy Gradient Methods


Title	Diverse Exploration via Conjugate Policies for Policy Gradient Methods
Authors	Andrew Cohen, Xingye Qiao, Lei Yu, Elliot Way, Xiangrong Tong
Abstract	We address the challenge of effective exploration while maintaining good performance in policy gradient methods. As a solution, we propose diverse exploration (DE) via conjugate policies. DE learns and deploys a set of conjugate policies which can be conveniently generated as a byproduct of conjugate gradient descent. We provide both theoretical and empirical results showing the effectiveness of DE at achieving exploration, improving policy performance, and the advantage of DE over exploration by random policy perturbations.
Tasks	Policy Gradient Methods
Published	2019-02-10
URL	http://arxiv.org/abs/1902.03633v1
PDF	http://arxiv.org/pdf/1902.03633v1.pdf
PWC	https://paperswithcode.com/paper/diverse-exploration-via-conjugate-policies
Repo
Framework

Mitigating Deep Learning Vulnerabilities from Adversarial Examples Attack in the Cybersecurity Domain


Title	Mitigating Deep Learning Vulnerabilities from Adversarial Examples Attack in the Cybersecurity Domain
Authors	Chris Einar San Agustin
Abstract	Deep learning models are known to solve classification and regression problems with high accuracy by training for a number of epochs on large datasets. However, that doesn’t mean they are attack-proof or free of vulnerabilities. Newly deployed systems, particularly in public environments (i.e., public networks), are vulnerable to attacks from various entities. Moreover, published research on deep learning systems (Goodfellow et al., 2014) has identified a significant number of attack points and a wide attack surface with evidence of exploitation through adversarial examples. A successful exploit on these systems could have critical real-world repercussions. For instance: (1) an adversarial attack on a self-driving car running a deep reinforcement learning system could cause humans to be misclassified, leading to accidents; (2) a self-driving vehicle misreading a red light may crash into another car; (3) a pedestrian lane misclassified as an intersection lane could lead to car crashes. This is just the tip of the iceberg: computer vision deployments are not limited to self-driving cars but extend to many other areas that have a definite impact on the real world. These vulnerabilities must be mitigated at an early stage of development. It is imperative to develop and implement baseline security standards at a global level prior to real-world deployment.
Tasks	Adversarial Attack, Self-Driving Cars
Published	2019-05-09
URL	https://arxiv.org/abs/1905.03517v1
PDF	https://arxiv.org/pdf/1905.03517v1.pdf
PWC	https://paperswithcode.com/paper/190503517
Repo
Framework

How Compact?: Assessing Compactness of Representations through Layer-Wise Pruning


Title	How Compact?: Assessing Compactness of Representations through Layer-Wise Pruning
Authors	Hyun-Joo Jung, Jaedeok Kim, Yoonsuck Choe
Abstract	Various forms of representations may arise in the many layers embedded in deep neural networks (DNNs). Of these, where can we find the most compact representation? We propose to use a pruning framework to answer this question: How compact can each layer be compressed, without losing performance? Most of the existing DNN compression methods do not consider the relative compressibility of the individual layers. They uniformly apply a single target sparsity to all layers or adapt layer sparsity using heuristics and additional training. We propose a principled method that automatically determines the sparsity of individual layers derived from the importance of each layer. To do this, we consider a metric to measure the importance of each layer based on the layer-wise capacity. Given the trained model and the total target sparsity, we first evaluate the importance of each layer from the model. From the evaluated importance, we compute the layer-wise sparsity of each layer. The proposed method can be applied to any DNN architecture and can be combined with any pruning method that takes the total target sparsity as a parameter. To validate the proposed method, we carried out an image classification task with two types of DNN architectures on two benchmark datasets and used three pruning methods for compression. In case of VGG-16 model with weight pruning on the ImageNet dataset, we achieved up to 75% (17.5% on average) better top-5 accuracy than the baseline under the same total target sparsity. Furthermore, we analyzed where the maximum compression can occur in the network. This kind of analysis can help us identify the most compact representation within a deep neural network.
Tasks	Image Classification
Published	2019-01-09
URL	http://arxiv.org/abs/1901.02757v1
PDF	http://arxiv.org/pdf/1901.02757v1.pdf
PWC	https://paperswithcode.com/paper/how-compact-assessing-compactness-of
Repo
Framework
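
The abstract above describes deriving a per-layer sparsity from a per-layer importance score under a fixed total target sparsity, but it does not give the formula. The sketch below shows one hypothetical way to split a global budget so that the overall sparsity still matches the target. The allocation rule (kept weights proportional to importance times layer size, capped at the layer size) and the name `allocate_layer_sparsity` are assumptions for illustration, not the authors' method.

```python
from typing import List


def allocate_layer_sparsity(sizes: List[int],
                            importance: List[float],
                            total_sparsity: float) -> List[float]:
    """Split a global pruning budget across layers.

    Hypothetical rule (not from the paper): each layer's share of the kept
    weights is proportional to importance * size. A layer cannot keep more
    weights than it has, so capped layers are fixed at full density and the
    remaining budget is redistributed among the others.
    """
    keep_budget = (1.0 - total_sparsity) * sum(sizes)
    kept = [0.0] * len(sizes)
    active = set(range(len(sizes)))

    while active and keep_budget > 1e-9:
        weight_sum = sum(importance[i] * sizes[i] for i in active)
        if weight_sum <= 0:
            break  # no importance left to guide the split
        shares = {i: keep_budget * importance[i] * sizes[i] / weight_sum
                  for i in active}
        capped = {i for i in active if shares[i] >= sizes[i]}
        if not capped:
            for i in active:
                kept[i] = shares[i]
            break
        for i in capped:
            kept[i] = float(sizes[i])   # layer stays fully dense
            keep_budget -= sizes[i]
        active -= capped

    return [1.0 - k / n for k, n in zip(kept, sizes)]


# Hypothetical three-layer network: parameter counts and importance scores.
layer_sizes = [1000, 4000, 5000]
layer_importance = [3.0, 1.0, 0.5]
sparsities = allocate_layer_sparsity(layer_sizes, layer_importance, 0.8)

for size, s in zip(layer_sizes, sparsities):
    print(f"layer with {size} weights -> sparsity {s:.3f}")
```

With these toy numbers the most important layer is pruned least, and the size-weighted average of the per-layer sparsities equals the 0.8 target.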

Efficient Hybrid Network Architectures for Extremely Quantized Neural Networks Enabling Intelligence at the Edge


Title	Efficient Hybrid Network Architectures for Extremely Quantized Neural Networks Enabling Intelligence at the Edge
Authors	Indranil Chakraborty, Deboleena Roy, Aayush Ankit, Kaushik Roy
Abstract	The recent advent of ‘Internet of Things’ (IOT) has increased the demand for enabling AI-based edge computing. This has necessitated the search for efficient implementations of neural networks in terms of both computations and storage. Although extreme quantization has proven to be a powerful tool to achieve significant compression over full-precision networks, it can result in significant degradation in performance. In this work, we propose extremely quantized hybrid network architectures with both binary and full-precision sections to emulate the classification performance of full-precision networks while ensuring significant energy efficiency and memory compression. We explore several hybrid network architectures and analyze the performance of the networks in terms of accuracy, energy efficiency and memory compression. We perform our analysis on ResNet and VGG network architectures. Among the proposed network architectures, we show that the hybrid networks with full-precision residual connections emerge as the optimum by attaining accuracies close to full-precision networks while achieving excellent memory compression, up to 21.8x in case of VGG-19. This work demonstrates an effective way of hybridizing networks which achieve performance close to full-precision networks while attaining significant compression, furthering the feasibility of using such networks for energy-efficient neural computing in IOT-based edge devices.
Tasks	Quantization
Published	2019-02-01
URL	http://arxiv.org/abs/1902.00460v1
PDF	http://arxiv.org/pdf/1902.00460v1.pdf
PWC	https://paperswithcode.com/paper/efficient-hybrid-network-architectures-for
Repo
Framework
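
The memory compression of a hybrid binary/full-precision network, as discussed in the abstract above, can be estimated with simple arithmetic on weight storage. The sketch below assumes 32-bit full-precision weights and 1-bit binary weights and ignores activations, scaling factors, and other overheads; the parameter counts are hypothetical and do not correspond to any network in the paper.

```python
def memory_compression(binary_params: int,
                       full_precision_params: int,
                       full_bits: int = 32,
                       binary_bits: int = 1) -> float:
    """Ratio of full-precision weight storage to hybrid weight storage."""
    total_params = binary_params + full_precision_params
    # Baseline: every weight stored at full precision.
    baseline_bits = total_params * full_bits
    # Hybrid: binary sections use 1 bit per weight, the rest stay full precision.
    hybrid_bits = (binary_params * binary_bits
                   + full_precision_params * full_bits)
    return baseline_bits / hybrid_bits


# Hypothetical split: 90% of the weights binarized, 10% kept at full precision.
ratio = memory_compression(binary_params=900_000, full_precision_params=100_000)
print(f"memory compression: {ratio:.2f}x")
```

Under these assumptions the ratio is bounded above by 32x (a fully binary network), and even a small full-precision section lowers it noticeably, which is the trade-off a hybrid design has to balance against accuracy.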

Challenge of Spatial Cognition for Deep Learning


Title	Challenge of Spatial Cognition for Deep Learning
Authors	Xiaolin Wu, Xi Zhang, Jun Du
Abstract	Given the success of the deep convolutional neural networks (DCNNs) in applications of visual recognition and classification, it would be tantalizing to test if DCNNs can also learn spatial concepts, such as straightness, convexity, left/right, front/back, relative size, aspect ratio, polygons, etc., from varied visual examples of these concepts that are simple and yet vital for spatial reasoning. Much to our dismay, extensive experiments in the style of cognitive psychology demonstrate that data-driven deep learning (DL) cannot see through superficial variations in visual representations and grasp the spatial concept in abstraction. The root cause of failure turns out to be the learning methodology, not the computational model of the neural network itself. By incorporating task-specific convolutional kernels, we are able to construct DCNNs for spatial cognition tasks that can generalize to input images not drawn from the same distribution as the training set. This work cautions that, without manually incorporated priors or features, DCNNs may fail spatial cognitive tasks at a rudimentary level.
Tasks
Published	2019-07-30
URL	https://arxiv.org/abs/1908.04396v1
PDF	https://arxiv.org/pdf/1908.04396v1.pdf
PWC	https://paperswithcode.com/paper/challenge-of-spatial-cognition-for-deep
Repo
Framework

Regularising Deep Networks with Deep Generative Models


Title	Regularising Deep Networks with Deep Generative Models
Authors	Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes
Abstract	We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an arbitrary subset of activations conditioned on the remainder. This is a generalisation of data augmentation to the hidden layers of a network, and a form of data-aware dropout. We demonstrate that our training method leads to higher test accuracy and lower test-set cross-entropy for neural networks trained on CIFAR-10 and SVHN compared to standard regularisation baselines: our approach leads to networks with better calibrated uncertainty over the class posteriors all the while delivering greater test-set accuracy.
Tasks	Data Augmentation, Imputation
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11507v2
PDF	https://arxiv.org/pdf/1909.11507v2.pdf
PWC	https://paperswithcode.com/paper/regularising-deep-networks-with-dgms
Repo
Framework

Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization


Title	Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization
Authors	Matteo Turchetta, Andreas Krause, Sebastian Trimpe
Abstract	In reinforcement learning (RL), an autonomous agent learns to perform complex tasks by maximizing an exogenous reward signal while interacting with its environment. In real-world applications, test conditions may differ substantially from the training scenario and, therefore, focusing on pure reward maximization during training may lead to poor results at test time. In these cases, it is important to trade off performance against robustness while learning a policy. While several results exist for robust, model-based RL, the model-free case has not been widely investigated. In this paper, we cast the robust, model-free RL problem as a multi-objective optimization problem. To quantify the robustness of a policy, we use delay margin and gain margin, two robustness indicators that are common in control theory. We show how these metrics can be estimated from data in the model-free setting. We use multi-objective Bayesian optimization (MOBO) to efficiently solve this expensive-to-evaluate, multi-objective optimization problem. We show the benefits of our robust formulation both in sim-to-real and pure hardware experiments to balance a Furuta pendulum.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13399v1
PDF	https://arxiv.org/pdf/1910.13399v1.pdf
PWC	https://paperswithcode.com/paper/191013399
Repo
Framework

Online Causal Structure Learning in the Presence of Latent Variables


Title	Online Causal Structure Learning in the Presence of Latent Variables
Authors	Durdane Kocacoban, James Cussens
Abstract	We present two online causal structure learning algorithms which can track changes in a causal structure and process data in a dynamic real-time manner. Standard causal structure learning algorithms assume that causal structure does not change during the data collection process, but in real-world scenarios, it does often change. Therefore, it is inappropriate to handle such changes with existing batch-learning approaches, and instead, a structure should be learned in an online manner. The online causal structure learning algorithms we present here can revise correlation values without reprocessing the entire dataset and use an existing model to avoid relearning the causal links in the prior model, which still fit data. Proposed algorithms are tested on synthetic and real-world datasets, the latter being a seasonally adjusted commodity price index dataset for the U.S. The online causal structure learning algorithms outperformed standard FCI by a large margin in learning the changed causal structure correctly and efficiently when latent variables were present.
Tasks
Published	2019-04-30
URL	https://arxiv.org/abs/1904.13247v2
PDF	https://arxiv.org/pdf/1904.13247v2.pdf
PWC	https://paperswithcode.com/paper/online-causal-structure-learning-in-the
Repo
Framework

Identification of synoptic weather types over Taiwan area with multiple classifiers


Title	Identification of synoptic weather types over Taiwan area with multiple classifiers
Authors	Shih-Hao Su, Jung-Lien Chu, Ting-Shuo Yo, Lee-Yaw Lin
Abstract	In this study, a novel machine learning approach was used to classify three types of synoptic weather events in Taiwan area from 2001 to 2010. We used reanalysis data with three machine learning algorithms to recognize weather systems and evaluated their performance. Overall, the classifiers successfully identified 52-83% of weather events (hit rate), which is higher than the performance of traditional objective methods. The results showed that the machine learning approach gave low false alarm rate in general, while the support vector machine (SVM) with more principal components of reanalysis data had higher hit rate on all tested weather events. The sensitivity tests of grid data resolution indicated that the differences between the high- and low-resolution datasets are limited, which implied that the proposed method can achieve reasonable performance in weather forecasting with minimal resources. By identifying daily weather systems in historical reanalysis data, this method can be used to study long-term weather changes, to monitor climatological-scale variations, and to provide a better estimate of climate projections. Furthermore, this method can also serve as an alternative to model output statistics and potentially be used for synoptic weather forecasting.
Tasks	Weather Forecasting
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08736v1
PDF	https://arxiv.org/pdf/1905.08736v1.pdf
PWC	https://paperswithcode.com/paper/identification-of-synoptic-weather-types-over
Repo
Framework
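
The hit rate and false alarm rate mentioned in the abstract above are standard scores computed from a 2x2 contingency table of predicted versus observed events. The sketch below shows the usual definitions; the abstract does not say whether "false alarm rate" means the false alarm ratio, FP / (TP + FP), or the probability of false detection, FP / (FP + TN), so both are given. The function names and event counts are hypothetical.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of observed events that were correctly identified."""
    return hits / (hits + misses)


def false_alarm_ratio(hits: int, false_alarms: int) -> float:
    """Fraction of predicted events that did not actually occur."""
    return false_alarms / (hits + false_alarms)


def false_alarm_rate(false_alarms: int, correct_negatives: int) -> float:
    """Fraction of non-events that were wrongly flagged as events."""
    return false_alarms / (false_alarms + correct_negatives)


# Hypothetical contingency table for one weather type over 100 days.
hits, misses, false_alarms, correct_negatives = 30, 10, 5, 55

print(f"hit rate:          {hit_rate(hits, misses):.3f}")
print(f"false alarm ratio: {false_alarm_ratio(hits, false_alarms):.3f}")
print(f"false alarm rate:  {false_alarm_rate(false_alarms, correct_negatives):.3f}")
```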