April 1, 2020


Paper Group ANR 514


A Corpus for Detecting High-Context Medical Conditions in Intensive Care Patient Notes Focusing on Frequently Readmitted Patients. Evolving Loss Functions with Multivariate Taylor Polynomial Parameterizations. Learning Optimal Classification Trees: Strong Max-Flow Formulations. Deep RBF Value Functions for Continuous Control. A Graph-Based Platform …

A Corpus for Detecting High-Context Medical Conditions in Intensive Care Patient Notes Focusing on Frequently Readmitted Patients

Title A Corpus for Detecting High-Context Medical Conditions in Intensive Care Patient Notes Focusing on Frequently Readmitted Patients
Authors Edward T. Moseley, Joy T. Wu, Jonathan Welt, John Foote, Patrick D. Tyler, David W. Grant, Eric T. Carlson, Sebastian Gehrmann, Franck Dernoncourt, Leo Anthony Celi
Abstract A crucial step within secondary analysis of electronic health records (EHRs) is to identify the patient cohort under investigation. While EHRs contain medical billing codes that aim to represent the conditions and treatments patients may have, much of the information is only present in the patient notes. Therefore, it is critical to develop robust algorithms to infer patients’ conditions and treatments from their written notes. In this paper, we introduce a dataset for patient phenotyping, a task that is defined as the identification of whether a patient has a given medical condition (also referred to as clinical indication or phenotype) based on their patient note. Nursing Progress Notes and Discharge Summaries from the Intensive Care Unit of a large tertiary care hospital were manually annotated for the presence of several high-context phenotypes relevant to treatment and risk of re-hospitalization. This dataset contains 1102 Discharge Summaries and 1000 Nursing Progress Notes. Each Discharge Summary and Progress Note has been annotated by at least two expert human annotators (one clinical researcher and one resident physician). Annotated phenotypes include treatment non-adherence, chronic pain, advanced/metastatic cancer, as well as 10 other phenotypes. This dataset can be utilized for academic and industrial research in medicine and computer science, particularly within the field of medical natural language processing.
Published 2020-03-06
URL https://arxiv.org/abs/2003.03044v1
PDF https://arxiv.org/pdf/2003.03044v1.pdf
PWC https://paperswithcode.com/paper/a-corpus-for-detecting-high-context-medical

Evolving Loss Functions with Multivariate Taylor Polynomial Parameterizations

Title Evolving Loss Functions with Multivariate Taylor Polynomial Parameterizations
Authors Santiago Gonzalez, Risto Miikkulainen
Abstract Loss function optimization for neural networks has recently emerged as a new direction for metalearning, with Genetic Loss Optimization (GLO) providing a general approach for the discovery and optimization of such functions. GLO represents loss functions as trees that are evolved and further optimized using evolutionary strategies. However, searching in this space is difficult because most candidates are not valid loss functions. In this paper, a new technique, Multivariate Taylor expansion-based genetic loss-function optimization (TaylorGLO), is introduced to solve this problem. It represents functions using a novel parameterization based on Taylor expansions, making the search more effective. TaylorGLO is able to find new loss functions that outperform those found by GLO in many fewer generations, demonstrating that loss function optimization is a productive avenue for metalearning.
Published 2020-01-31
URL https://arxiv.org/abs/2002.00059v2
PDF https://arxiv.org/pdf/2002.00059v2.pdf
PWC https://paperswithcode.com/paper/evolving-loss-functions-with-multivariate

Learning Optimal Classification Trees: Strong Max-Flow Formulations

Title Learning Optimal Classification Trees: Strong Max-Flow Formulations
Authors Sina Aghaei, Andres Gomez, Phebe Vayanos
Abstract We consider the problem of learning optimal binary classification trees. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer programming (MIP) technology. Yet, existing approaches from the literature do not leverage the power of MIP to its full extent. Indeed, they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose a flow-based MIP formulation for optimal binary classification trees that has a stronger linear programming relaxation. Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders’ decomposition method, which scales to larger instances. We conduct extensive computational experiments on standard benchmark datasets, on which we show that our proposed approaches are 50 times faster than state-of-the-art MIP-based techniques and improve out-of-sample performance by up to 13.8%.
Published 2020-02-21
URL https://arxiv.org/abs/2002.09142v1
PDF https://arxiv.org/pdf/2002.09142v1.pdf
PWC https://paperswithcode.com/paper/learning-optimal-classification-trees-strong

Deep RBF Value Functions for Continuous Control

Title Deep RBF Value Functions for Continuous Control
Authors Kavosh Asadi, Ronald E. Parr, George D. Konidaris, Michael L. Littman
Abstract A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned state-action value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep RBF value functions: state-action value functions learned using a deep neural network with a radial-basis function (RBF) output layer. We show that the optimal action with respect to a deep RBF value function can be easily approximated up to any desired accuracy. Moreover, deep RBF value functions can represent any true value function up to any desired accuracy owing to their support for universal function approximation. By learning a deep RBF value function, we extend the standard DQN algorithm to continuous control, and demonstrate that the resultant agent, RBF-DQN, outperforms standard baselines on a set of continuous-action RL problems.
Tasks Continuous Control
Published 2020-02-05
URL https://arxiv.org/abs/2002.01883v1
PDF https://arxiv.org/pdf/2002.01883v1.pdf
PWC https://paperswithcode.com/paper/deep-rbf-value-functions-for-continuous
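
The abstract above says that the optimal action under a deep RBF value function can be approximated easily. A minimal sketch of why that might hold is given below, for a single fixed state. The normalized negative-exponential kernel, the names `rbf_q_value` and `approx_greedy_action`, and the parameter `beta` are all assumptions made for illustration; the abstract does not specify the kernel or the network that would produce the centroids and their values.

```python
import numpy as np

def rbf_q_value(action, centroids, values, beta):
    """Normalized-RBF Q-value for one fixed state (assumed functional form).

    centroids: (N, d) array of centroid actions a_i(s)
    values:    (N,) array of centroid values v_i(s)
    beta:      smoothing parameter, assumed > 0
    """
    dists = np.linalg.norm(centroids - action, axis=1)
    logits = -beta * dists
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    # Q is a convex combination of the centroid values.
    return float(weights @ values)

def approx_greedy_action(centroids, values, beta):
    """Approximate argmax_a Q(s, a) by evaluating Q at the centroids only."""
    q_at_centroids = [rbf_q_value(c, centroids, values, beta) for c in centroids]
    best = int(np.argmax(q_at_centroids))
    return centroids[best], q_at_centroids[best]
```

Because this Q-value is a weighted average of the centroid values, it can never exceed the largest one, and for large `beta` the value at each centroid approaches that centroid's own value. Searching over the N centroids is therefore a cheap stand-in for a continuous maximization in this sketch.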

A Graph-Based Platform for Customer Behavior Analysis using Applications’ Clickstream Data

Title A Graph-Based Platform for Customer Behavior Analysis using Applications’ Clickstream Data
Authors Mojgan Mohajer
Abstract Clickstream analysis is getting more attention since the increase of usage in e-commerce and applications. Besides customers’ purchase behavior analysis, there are also attempts to analyze customer behavior in relation to the quality of web or application design. In general, clickstream data can be considered as a sequence of log events collected at different levels of web/app usage. The analysis of clickstream data can be performed directly as sequence analysis or by extracting features from sequences. In this work, we show how representing and saving the sequences with their underlying graph structures can induce a platform for customer behavior analysis. Our main idea is that clickstream data containing sequences of actions of an application are walks of the corresponding finite state automaton (FSA) of that application. Our hypothesis is that the customers of an application normally do not use all possible walks through that FSA and that the number of actual walks is much smaller than the total number of possible walks through the FSA. Sequences of such a walk normally consist of a finite number of cycles on FSA graphs. Identifying and matching these cycles in classical sequence analysis is not straightforward. We show that representing the sequences through their underlying graph structures not only groups the sequences automatically but also provides a compressed data representation of the original sequences.
Published 2020-02-20
URL https://arxiv.org/abs/2002.10269v1
PDF https://arxiv.org/pdf/2002.10269v1.pdf
PWC https://paperswithcode.com/paper/a-graph-based-platform-for-customer-behavior
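
The grouping effect described in the abstract above can be illustrated with a small sketch. It assumes that a session's "underlying graph" is simply the set of distinct transitions it traverses; the helper names `sequence_to_graph` and `group_by_graph` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def sequence_to_graph(seq):
    """Return the directed graph underlying a walk, as a frozenset of edges."""
    return frozenset(zip(seq, seq[1:]))

def group_by_graph(sequences):
    """Group sequences that traverse exactly the same set of transitions."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[sequence_to_graph(seq)].append(seq)
    return dict(groups)

# Hypothetical sessions: the first and third differ only in how many times
# they repeat the search -> item -> search cycle.
sessions = [
    ["home", "search", "item", "search", "item", "cart"],
    ["home", "search", "item", "cart"],
    ["home", "search", "item", "search", "item", "search", "item", "cart"],
    ["home", "cart"],
]
groups = group_by_graph(sessions)
```

Under this assumption, sessions that repeat the same cycle a different number of times collapse onto one graph, which is also why the graph can be much smaller than the sequences it stands for. Edge counts and ordering are discarded here; a fuller representation would likely keep them.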

Backtracking Gradient Descent allowing unbounded learning rates

Title Backtracking Gradient Descent allowing unbounded learning rates
Authors Tuyen Trung Truong
Abstract In unconstrained optimisation on a Euclidean space, to prove convergence in Gradient Descent processes (GD) $x_{n+1}=x_n-\delta _n \nabla f(x_n)$ it is usually required that the learning rates $\delta _n$ are bounded: $\delta _n\leq \delta $ for some positive $\delta $. Under this assumption, if the sequence $x_n$ converges to a critical point $z$, then with large values of $n$ the update will be small because $\|x_{n+1}-x_n\|\lesssim \|\nabla f(x_n)\|$. This may also force the sequence to converge to a bad minimum. If we can allow, at least theoretically, that the learning rates $\delta _n$ are not bounded, then we may have better convergence to better minima. A previous joint paper by the author showed convergence for the usual version of Backtracking GD under very general assumptions on the cost function $f$. In this paper, we allow the learning rates $\delta _n$ to be unbounded, in the sense that there is a function $h:(0,\infty)\rightarrow (0,\infty )$ such that $\lim _{t\rightarrow 0}th(t)=0$ and $\delta _n\lesssim \max \{h(x_n),\delta \}$ satisfies Armijo’s condition for all $n$, and prove convergence under the same assumptions as in the mentioned paper. It will be shown that this growth rate of $h$ is best possible if one wants convergence of the sequence $\{x_n\}$. A specific way for choosing $\delta _n$ in a discrete way connects to Two-way Backtracking GD defined in the mentioned paper. We provide some results which either improve or are implicitly contained in those in the mentioned paper and another recent paper on avoidance of saddle points.
Published 2020-01-07
URL https://arxiv.org/abs/2001.02005v2
PDF https://arxiv.org/pdf/2001.02005v2.pdf
PWC https://paperswithcode.com/paper/backtracking-gradient-descent-allowing
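
For readers unfamiliar with Armijo's condition, the sketch below shows the usual bounded form of Backtracking GD that the abstract above takes as its starting point. It is not the unbounded variant the paper proposes. The parameter names `delta0`, `alpha`, `beta`, and `min_delta` are illustrative assumptions.

```python
import numpy as np

def backtracking_gd(f, grad, x0, delta0=1.0, alpha=0.5, beta=0.5,
                    tol=1e-8, max_iter=1000, min_delta=1e-12):
    """Gradient descent with Armijo backtracking (bounded learning rates).

    Each step starts from the trial rate delta0 and shrinks it by the factor
    beta until Armijo's condition holds:
        f(x - delta * g) <= f(x) - alpha * delta * ||g||^2
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = float(g @ g)
        if g_norm_sq < tol ** 2:  # gradient is numerically zero: stop
            break
        fx = f(x)
        delta = delta0
        # min_delta guards against an endless loop caused by rounding.
        while delta > min_delta and f(x - delta * g) > fx - alpha * delta * g_norm_sq:
            delta *= beta
        x = x - delta * g
    return x

# Usage on an ill-conditioned quadratic with its minimum at the origin.
x_min = backtracking_gd(
    lambda x: x[0] ** 2 + 10.0 * x[1] ** 2,
    lambda x: np.array([2.0 * x[0], 20.0 * x[1]]),
    [3.0, -2.0],
)
```

In this form every accepted rate satisfies `delta <= delta0`, which is exactly the boundedness assumption the abstract sets out to relax.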

Big-Data Science in Porous Materials: Materials Genomics and Machine Learning

Title Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
Authors Kevin Maik Jablonka, Daniele Ongari, Seyed Mohamad Moosavi, Berend Smit
Abstract By combining metal nodes with organic linkers we can potentially synthesize millions of possible metal organic frameworks (MOFs). At present, we have libraries of over ten thousand synthesized materials and millions of in-silico predicted materials. The fact that we have so many materials opens many exciting avenues to tailor-make a material that is optimal for a given application. However, from an experimental and computational point of view we simply have too many materials to screen using brute-force techniques. In this review, we show that having so many materials allows us to use big-data methods as a powerful technique to study these materials and to discover complex correlations. The first part of the review gives an introduction to the principles of big-data science. We emphasize the importance of data collection, methods to augment small data sets, and how to select appropriate training sets. An important part of this review is the different approaches that are used to represent these materials in feature space. The review also includes a general overview of the different ML techniques, but as most applications in porous materials use supervised ML our review is focused on the different approaches for supervised ML. In particular, we review the different methods to optimize the ML process and how to quantify the performance of the different methods. In the second part, we review how the different approaches of ML have been applied to porous materials. In particular, we discuss applications in the field of gas storage and separation, the stability of these materials, their electronic properties, and their synthesis. The range of topics illustrates the large variety of topics that can be studied with big-data science. Given the increasing interest of the scientific community in ML, we expect this list to rapidly expand in the coming years.
Published 2020-01-18
URL https://arxiv.org/abs/2001.06728v2
PDF https://arxiv.org/pdf/2001.06728v2.pdf
PWC https://paperswithcode.com/paper/big-data-science-in-porous-materials

Physics-informed deep learning for incompressible laminar flows

Title Physics-informed deep learning for incompressible laminar flows
Authors Chengping Rao, Hao Sun, Yang Liu
Abstract Physics-informed deep learning (PIDL) has drawn tremendous interest in recent years to solve computational physics problems. The basic concept of PIDL is to embed available physical laws to constrain/inform neural networks, reducing the need for rich data when training a reliable model. This can be achieved by incorporating the residual of the partial differential equations and the initial/boundary conditions into the loss function. By minimizing the loss function, the neural network can approximate the solution to the physical field of interest. In this paper, we propose a mixed-variable scheme of physics-informed neural network (PINN) for fluid dynamics and apply it to simulate steady and transient laminar flows at low Reynolds numbers. The velocity and pressure fields predicted by the proposed PINN approach are compared with the reference numerical solutions. Simulation results demonstrate the great potential of the proposed PINN for fluid flow simulation with high accuracy.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10558v1
PDF https://arxiv.org/pdf/2002.10558v1.pdf
PWC https://paperswithcode.com/paper/physics-informed-deep-learning-for

Goal-Oriented Multi-Task BERT-Based Dialogue State Tracker

Title Goal-Oriented Multi-Task BERT-Based Dialogue State Tracker
Authors Pavel Gulyaev, Eugenia Elistratova, Vasily Konovalov, Yuri Kuratov, Leonid Pugachev, Mikhail Burtsev
Abstract Dialogue State Tracking (DST) is a core component of virtual assistants such as Alexa or Siri. To accomplish various tasks, these assistants need to support an increasing number of services and APIs. The Schema-Guided State Tracking track of the 8th Dialogue System Technology Challenge highlighted the DST problem for unseen services. The organizers introduced the Schema-Guided Dialogue (SGD) dataset with multi-domain conversations and released a zero-shot dialogue state tracking model. In this work, we propose a GOaL-Oriented Multi-task BERT-based dialogue state tracker (GOLOMB) inspired by architectures for reading comprehension question answering systems. The model “queries” dialogue history with descriptions of slots and services as well as possible values of slots. This allows the model to transfer slot values in multi-domain dialogues and to scale to unseen slot types. Our model achieves a joint goal accuracy of 53.97% on the SGD dataset, outperforming the baseline model.
Tasks Dialogue State Tracking, Question Answering, Reading Comprehension
Published 2020-02-05
URL https://arxiv.org/abs/2002.02450v1
PDF https://arxiv.org/pdf/2002.02450v1.pdf
PWC https://paperswithcode.com/paper/goal-oriented-multi-task-bert-based-dialogue
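
The joint goal accuracy quoted in the abstract above counts a dialogue turn as correct only when every slot in the predicted state matches the reference. The sketch below uses plain exact matching on dictionaries; official benchmark scripts may apply additional rules (for example, fuzzy matching of free-text values), so this is an illustration of the idea rather than the SGD evaluation itself. The function name and data layout are assumptions.

```python
def joint_goal_accuracy(predicted_states, reference_states):
    """Fraction of turns whose predicted state equals the reference exactly.

    Each state is a dict mapping slot name -> value for one dialogue turn.
    """
    if len(predicted_states) != len(reference_states):
        raise ValueError("prediction and reference must cover the same turns")
    if not reference_states:
        return 0.0
    correct = sum(pred == ref
                  for pred, ref in zip(predicted_states, reference_states))
    return correct / len(reference_states)

# Hypothetical three-turn dialogue: the second turn gets one slot wrong,
# so the whole turn counts as incorrect.
reference = [
    {"city": "Paris"},
    {"city": "Paris", "date": "Friday"},
    {"city": "Paris", "date": "Friday", "guests": "2"},
]
predicted = [
    {"city": "Paris"},
    {"city": "Paris", "date": "Saturday"},
    {"city": "Paris", "date": "Friday", "guests": "2"},
]
accuracy = joint_goal_accuracy(predicted, reference)
```

Because a single wrong slot invalidates the entire turn, joint goal accuracy is considerably stricter than per-slot accuracy.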

Self-supervised Representation Learning for Ultrasound Video

Title Self-supervised Representation Learning for Ultrasound Video
Authors Jianbo Jiao, Richard Droste, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble
Abstract Recent advances in deep learning have achieved promising performance for medical image analysis, while in most cases ground-truth annotations from human experts are necessary to train the deep model. In practice, such annotations are expensive to collect and can be scarce for medical imaging applications. Therefore, there is significant interest in learning representations from unlabelled raw data. In this paper, we propose a self-supervised learning approach to learn meaningful and transferable representations from medical imaging video without any type of human annotation. We assume that in order to learn such a representation, the model should identify anatomical structures from the unlabelled data. Therefore we force the model to address anatomy-aware tasks with free supervision from the data itself. Specifically, the model is designed to correct the order of a reshuffled video clip and at the same time predict the geometric transformation applied to the video clip. Experiments on fetal ultrasound video show that the proposed approach can effectively learn meaningful and strong representations, which transfer well to downstream tasks like standard plane detection and saliency prediction.
Tasks Representation Learning, Saliency Prediction
Published 2020-02-28
URL https://arxiv.org/abs/2003.00105v1
PDF https://arxiv.org/pdf/2003.00105v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-representation-learning-for

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

Title Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Authors Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov
Abstract Multi-task reinforcement learning (RL) aims to simultaneously learn policies for solving many tasks. Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency. Relabeling methods typically ask: if, in hindsight, we assume that our experience was optimal for some task, for what task was it optimal? In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem with RL algorithms to efficiently solve many tasks. We use this idea to generalize goal-relabeling techniques from prior work to arbitrary classes of tasks. Our experiments confirm that relabeling data using inverse RL accelerates learning in general multi-task settings, including goal-reaching, domains with discrete sets of rewards, and those with linear reward functions.
Published 2020-02-25
URL https://arxiv.org/abs/2002.11089v1
PDF https://arxiv.org/pdf/2002.11089v1.pdf
PWC https://paperswithcode.com/paper/rewriting-history-with-inverse-rl-hindsight

Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation

Title Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation
Authors Kang Min Yoo, Hanbit Lee, Franck Dernoncourt, Trung Bui, Walter Chang, Sang-goo Lee
Abstract Recent works have shown that generative data augmentation, where synthetic samples generated from deep generative models are used to augment the training dataset, benefits certain NLP tasks. In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs. Since goal-oriented dialogs naturally exhibit a hierarchical structure over utterances and related annotations, deep generative data augmentation for the task requires the generative model to be aware of the hierarchical nature. We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling complete aspects of goal-oriented dialogs, including linguistic features and underlying structured annotations, namely dialog acts and goals. We also propose two training policies to mitigate issues that arise with training VAE-based models. Experiments show that our hierarchical model is able to generate realistic and novel samples that improve the robustness of state-of-the-art dialog state trackers, ultimately improving the dialog state tracking performance on various dialog domains. Surprisingly, the ability to jointly generate dialog features enables our model to outperform the previous state of the art in related subtasks, such as language generation and user simulation.
Tasks Data Augmentation, Dialogue State Tracking, Text Generation
Published 2020-01-23
URL https://arxiv.org/abs/2001.08604v2
PDF https://arxiv.org/pdf/2001.08604v2.pdf
PWC https://paperswithcode.com/paper/variational-hierarchical-dialog-autoencoder

DEPARA: Deep Attribution Graph for Deep Knowledge Transferability

Title DEPARA: Deep Attribution Graph for Deep Knowledge Transferability
Authors Jie Song, Yixin Chen, Jingwen Ye, Xinchao Wang, Chengchao Shen, Feng Mao, Mingli Song
Abstract Exploring the intrinsic interconnections between the knowledge encoded in PRe-trained Deep Neural Networks (PR-DNNs) of heterogeneous tasks sheds light on their mutual transferability, and consequently enables knowledge transfer from one task to another so as to reduce the training effort of the latter. In this paper, we propose the DEeP Attribution gRAph (DEPARA) to investigate the transferability of knowledge learned from PR-DNNs. In DEPARA, nodes correspond to the inputs and are represented by their vectorized attribution maps with regards to the outputs of the PR-DNN. Edges denote the relatedness between inputs and are measured by the similarity of their features extracted from the PR-DNN. The knowledge transferability of two PR-DNNs is measured by the similarity of their corresponding DEPARAs. We apply DEPARA to two important yet under-studied problems in transfer learning: pre-trained model selection and layer selection. Extensive experiments are conducted to demonstrate the effectiveness and superiority of the proposed method in solving both these problems. Code, data and models reproducing the results in this paper are available at https://github.com/zju-vipa/DEPARA.
Tasks Model Selection, Transfer Learning
Published 2020-03-17
URL https://arxiv.org/abs/2003.07496v1
PDF https://arxiv.org/pdf/2003.07496v1.pdf
PWC https://paperswithcode.com/paper/depara-deep-attribution-graph-for-deep

Segmentation-based Method combined with Dynamic Programming for Brain Midline Delineation

Title Segmentation-based Method combined with Dynamic Programming for Brain Midline Delineation
Authors Shen Wang, Kongming Liang, Chengwei Pan, Chuyang Ye, Xiuli Li, Feng Liu, Yizhou Yu, Yizhou Wang
Abstract Midline-related pathological image features are crucial for evaluating the severity of brain compression caused by stroke or traumatic brain injury (TBI). Automated midline delineation not only improves the assessment and clinical decision making for patients with stroke symptoms or head trauma but also reduces the time of diagnosis. Nevertheless, most of the previous methods model the midline by localizing the anatomical points, which are hard to detect or even missing in severe cases. In this paper, we formulate the brain midline delineation as a segmentation task and propose a three-stage framework. The proposed framework first aligns an input CT image into the standard space. Then, the aligned image is processed by a midline detection network (MD-Net) integrated with the CoordConv Layer and Cascade AtrousConv Module to obtain the probability map. Finally, we formulate the optimal midline selection as a pathfinding problem to solve the problem of the discontinuity of midline delineation. Experimental results show that our proposed framework can achieve superior performance on one in-house dataset and one public dataset.
Tasks Decision Making
Published 2020-02-27
URL https://arxiv.org/abs/2002.11918v1
PDF https://arxiv.org/pdf/2002.11918v1.pdf
PWC https://paperswithcode.com/paper/segmentation-based-method-combined-with
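
The final stage described in the abstract above, selecting a continuous midline from a probability map by pathfinding, can be sketched with a simple dynamic program. The scoring rule (sum of probabilities along the path) and the `max_shift` continuity constraint are assumptions chosen for illustration; the abstract does not state the cost function the authors use.

```python
import numpy as np

def best_midline_path(prob_map, max_shift=1):
    """Pick one column per row so the summed probability is maximal, with
    consecutive rows differing by at most max_shift columns."""
    prob_map = np.asarray(prob_map, dtype=float)
    n_rows, n_cols = prob_map.shape
    score = np.full((n_rows, n_cols), -np.inf)
    back = np.zeros((n_rows, n_cols), dtype=int)
    score[0] = prob_map[0]
    for r in range(1, n_rows):
        for c in range(n_cols):
            lo, hi = max(0, c - max_shift), min(n_cols, c + max_shift + 1)
            # Best reachable predecessor column in the previous row.
            prev = lo + int(np.argmax(score[r - 1, lo:hi]))
            score[r, c] = score[r - 1, prev] + prob_map[r, c]
            back[r, c] = prev
    # Trace the best path back from the bottom row.
    path = [int(np.argmax(score[-1]))]
    for r in range(n_rows - 1, 0, -1):
        path.append(int(back[r, path[-1]]))
    return path[::-1]

# Hypothetical map whose third row is almost empty (a discontinuity).
prob_map = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.8, 0.1, 0.0],
    [0.0, 0.0, 0.05, 0.0],
    [0.0, 0.0, 0.9, 0.0],
]
path = best_midline_path(prob_map)
```

Because the path must pass through every row, rows where the network output is weak or missing are bridged by the continuity constraint instead of leaving a gap.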

Technical report: Kidney tumor segmentation using a 2D U-Net followed by a statistical post-processing filter

Title Technical report: Kidney tumor segmentation using a 2D U-Net followed by a statistical post-processing filter
Authors Iwan Paolucci
Abstract Each year, there are about 400’000 new cases of kidney cancer worldwide causing around 175’000 deaths. For clinical decision making it is important to understand the morphometry of the tumor, which involves the time-consuming task of delineating tumor and kidney in 3D CT images. Automatic segmentation could be an important tool for clinicians and researchers to also study the correlations between tumor morphometry and clinical outcomes. We present a segmentation method which combines the popular U-Net convolutional neural network architecture with post-processing based on statistical constraints of the available training data. The full implementation, based on PyTorch, and the trained weights can be found on GitHub.
Tasks Decision Making
Published 2020-02-25
URL https://arxiv.org/abs/2002.10727v1
PDF https://arxiv.org/pdf/2002.10727v1.pdf
PWC https://paperswithcode.com/paper/technical-report-kidney-tumor-segmentation