April 3, 2020


Paper Group ANR 80

Multi-objective Ranking via Constrained Optimization. A Framework for Online Investment Algorithms. Defense Through Diverse Directions. Parameter Sharing in Coagent Networks. Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties. Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variabl …

Multi-objective Ranking via Constrained Optimization


Title	Multi-objective Ranking via Constrained Optimization
Authors	Michinari Momma, Alireza Bagheri Garakani, Nanxun Ma, Yi Sun
Abstract	In this paper, we introduce an Augmented Lagrangian based method to incorporate the multiple objectives (MO) in a search ranking algorithm. Optimizing MOs is an essential and realistic requirement for building ranking models in production. The proposed method formulates MO in constrained optimization and solves the problem in the popular Boosting framework – a novel contribution of our work. Furthermore, we propose a procedure to set up all optimization parameters in the problem. The experimental results show that the method successfully achieves MO criteria much more efficiently than existing methods.
Tasks
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05753v1
PDF	https://arxiv.org/pdf/2002.05753v1.pdf
PWC	https://paperswithcode.com/paper/multi-objective-ranking-via-constrained
Repo
Framework
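
As a rough illustration of the augmented Lagrangian idea named in the abstract above, the sketch below minimizes a primary objective subject to one inequality constraint on a toy scalar problem. It is not the paper's boosting-based ranking method; the function name, the toy objective and constraint, and the step sizes are all assumptions chosen for illustration.

```python
def augmented_lagrangian(f_grad, g, g_grad, x0=0.0, mu=1.0,
                         outer_steps=50, inner_steps=200, lr=0.05):
    """Minimize f(x) subject to g(x) <= 0 for a scalar x (toy sketch)."""
    x, lam = x0, 0.0
    for _ in range(outer_steps):
        # Inner loop: gradient descent on the augmented Lagrangian
        #   f(x) + (1 / (2 * mu)) * (max(0, lam + mu * g(x))**2 - lam**2)
        for _ in range(inner_steps):
            shifted = max(0.0, lam + mu * g(x))
            x -= lr * (f_grad(x) + shifted * g_grad(x))
        # Outer loop: multiplier update, clipped at zero for an inequality
        lam = max(0.0, lam + mu * g(x))
    return x, lam


# Hypothetical problem: primary objective (x - 2)^2, constraint x - 1 <= 0.
x_opt, lam_opt = augmented_lagrangian(
    f_grad=lambda x: 2.0 * (x - 2.0),
    g=lambda x: x - 1.0,
    g_grad=lambda x: 1.0,
)
```

In this toy setting the constraint is active at the solution, so the multiplier settles at a positive value that balances the gradient of the primary objective.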

A Framework for Online Investment Algorithms


Title	A Framework for Online Investment Algorithms
Authors	Andrew Paskaramoorthy, Terence van Zyl, Tim Gebbie
Abstract	The artificial segmentation of an investment management process into a workflow with silos of offline human operators can restrict silos from collectively and adaptively pursuing a unified optimal investment goal. To meet the investor’s objectives, an online algorithm can provide an explicit incremental approach that makes sequential updates as data arrives at the process level. This is in stark contrast to offline (or batch) processes that are focused on making component level decisions prior to process level integration. Here we present and report results for an integrated, online framework for algorithmic portfolio management. This article provides a workflow that can in turn be embedded into a process level learning framework. The workflow can be enhanced to refine signal generation and asset-class evolution and definitions. Our results confirm that we can use our framework in conjunction with resampling methods to outperform naive market capitalisation benchmarks while making clear the extent of back-test over-fitting. We consider such an online update framework to be a crucial step towards developing intelligent portfolio selection algorithms that integrate financial theory, investor views, and data analysis with process-level learning.
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13360v1
PDF	https://arxiv.org/pdf/2003.13360v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-online-investment-algorithms
Repo
Framework

Defense Through Diverse Directions


Title	Defense Through Diverse Directions
Authors	Christopher M. Bender, Yang Li, Yifeng Shi, Michael K. Reiter, Junier B. Oliva
Abstract	In this work we develop a novel Bayesian neural network methodology to achieve strong adversarial robustness without the need for online adversarial training. Unlike previous efforts in this direction, we do not rely solely on the stochasticity of network weights by minimizing the divergence between the learned parameter distribution and a prior. Instead, we additionally require that the model maintain some expected uncertainty with respect to all input covariates. We demonstrate that by encouraging the network to distribute evenly across inputs, the network becomes less susceptible to localized, brittle features which imparts a natural robustness to targeted perturbations. We show empirical robustness on several benchmark datasets.
Tasks
Published	2020-03-24
URL	https://arxiv.org/abs/2003.10602v1
PDF	https://arxiv.org/pdf/2003.10602v1.pdf
PWC	https://paperswithcode.com/paper/defense-through-diverse-directions
Repo
Framework

Parameter Sharing in Coagent Networks


Title	Parameter Sharing in Coagent Networks
Authors	Modjtaba Shokrian Zini
Abstract	In this paper, we aim to prove the theorem that generalizes the Coagent Network Policy Gradient Theorem (Kostas et al., 2019) to the context where parameters are shared among the function approximators involved. This provides the theoretical foundation to use any pattern of parameter sharing and leverage the freedom in the graph structure of the network to possibly exploit relational bias in a given task. As another application, we will apply our result to give a more intuitive proof for the Hierarchical Option Critic Policy Gradient Theorem, first shown in (Riemer et al., 2019).
Tasks
Published	2020-01-28
URL	https://arxiv.org/abs/2001.10474v1
PDF	https://arxiv.org/pdf/2001.10474v1.pdf
PWC	https://paperswithcode.com/paper/parameter-sharing-in-coagent-networks
Repo
Framework

Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties


Title	Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties
Authors	Qingrui Zhang, Wei Pan, Vasso Reppa
Abstract	This paper presents a novel model-reference reinforcement learning control method for uncertain autonomous surface vehicles. The proposed control combines a conventional control method with deep reinforcement learning. With the conventional control, we can ensure the learning-based control law provides closed-loop stability for the overall system, and potentially increase the sample efficiency of the deep reinforcement learning. With the reinforcement learning, we can directly learn a control law to compensate for modeling uncertainties. In the proposed control, a nominal system is employed for the design of a baseline control law using a conventional control approach. The nominal system also defines the desired performance for uncertain autonomous vehicles to follow. In comparison with traditional deep reinforcement learning methods, our proposed learning-based control can provide stability guarantees and better sample efficiency. We demonstrate the performance of the new algorithm via extensive simulation results.
Tasks	Autonomous Vehicles
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13839v1
PDF	https://arxiv.org/pdf/2003.13839v1.pdf
PWC	https://paperswithcode.com/paper/model-reference-reinforcement-learning
Repo
Framework

Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance


Title	Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance
Authors	Lucas Mentch, Siyu Zhou
Abstract	As the size, complexity, and availability of data continue to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications. Tools like random forest have an established track record of off-the-shelf success and even offer various strategies for analyzing the underlying relationships between features and the response. Motivated by recent insights into random forest behavior, here we introduce the idea of augmented bagging (AugBagg), a procedure that operates in an identical fashion to the classical bagging and random forest counterparts but which operates on a larger space containing additional, randomly generated features. Somewhat surprisingly, we demonstrate that the simple act of adding additional random features into the model can have a dramatic beneficial effect on performance, sometimes outperforming even an optimally tuned traditional random forest. This finding that the inclusion of an additional set of features generated independently of the response can considerably improve predictive performance has crucial implications for the manner in which we consider and measure variable importance. Numerous demonstrations on both real and synthetic data are provided.
Tasks
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03629v1
PDF	https://arxiv.org/pdf/2003.03629v1.pdf
PWC	https://paperswithcode.com/paper/getting-better-from-worse-augmented-bagging
Repo
Framework
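
The feature-augmentation step described in the abstract above can be sketched in a few lines. This is only the "add randomly generated features" part, after which ordinary bagging or a random forest would be fit on the widened matrix. The function name, the standard normal noise distribution, and the toy data are assumptions for illustration, not details taken from the paper.

```python
import random


def augment_with_noise(X, n_extra, seed=0):
    """Append n_extra noise columns, drawn independently of any response."""
    rng = random.Random(seed)
    return [
        list(row) + [rng.gauss(0.0, 1.0) for _ in range(n_extra)]
        for row in X
    ]


# Hypothetical design matrix with 3 rows and 2 original features.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X_aug = augment_with_noise(X, n_extra=3)
```

The original columns are left untouched, so any downstream learner sees the same signal plus extra columns that carry no information about the response.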

Deep Representation Learning of Electronic Health Records to Unlock Patient Stratification at Scale


Title	Deep Representation Learning of Electronic Health Records to Unlock Patient Stratification at Scale
Authors	Isotta Landi, Benjamin S. Glicksberg, Hao-Chih Lee, Sarah Cherng, Giulia Landi, Matteo Danieletto, Joel T. Dudley, Cesare Furlanello, Riccardo Miotto
Abstract	Objective: Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here, we present a novel unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. Materials and methods: We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks and autoencoders (i.e., “ConvAE”) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. Results: ConvAE significantly outperformed several common baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson’s disease and Alzheimer’s disease, largely related to comorbidities, disease progression, and symptom severity. Conclusions: Patient representations derived from modeling EHRs with ConvAE can help develop personalized medicine therapeutic strategies and better understand varying etiologies in heterogeneous sub-populations.
Tasks	Representation Learning, Word Embeddings
Published	2020-03-14
URL	https://arxiv.org/abs/2003.06516v1
PDF	https://arxiv.org/pdf/2003.06516v1.pdf
PWC	https://paperswithcode.com/paper/deep-representation-learning-of-electronic
Repo
Framework
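
The abstract above reports clustering quality as entropy and purity scores. The sketch below shows the commonly used definitions of these two measures; the abstract does not spell out the exact variants used in the paper, so the formulas, the function name, and the toy labels are assumptions.

```python
from collections import Counter, defaultdict
from math import log2


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, size-weighted mean entropy in bits) of a clustering."""
    groups = defaultdict(list)
    for cluster, label in zip(cluster_ids, class_labels):
        groups[cluster].append(label)

    n = len(class_labels)
    purity = 0.0
    entropy = 0.0
    for members in groups.values():
        counts = Counter(members)
        # Purity: fraction of all points that sit in their cluster's majority class.
        purity += max(counts.values()) / n
        # Entropy of the class distribution inside this cluster.
        h = -sum((k / len(members)) * log2(k / len(members))
                 for k in counts.values())
        entropy += (len(members) / n) * h
    return purity, entropy


# Hypothetical labels: a perfect clustering and a single mixed cluster.
perfect = purity_and_entropy([0, 0, 1, 1], ["a", "a", "b", "b"])
mixed = purity_and_entropy([0, 0, 0, 0], ["a", "a", "b", "b"])
```

Under these definitions higher purity and lower entropy both indicate clusters that align better with the reference classes.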

Membership Inference Attacks and Defenses in Supervised Learning via Generalization Gap


Title	Membership Inference Attacks and Defenses in Supervised Learning via Generalization Gap
Authors	Jiacheng Li, Ninghui Li, Bruno Ribeiro
Abstract	This work studies membership inference (MI) attack against classifiers, where the attacker’s goal is to determine whether a data instance was used for training the classifier. While it is known that overfitting makes classifiers susceptible to MI attacks, we showcase a simple numerical relationship between the generalization gap (the difference between training and test accuracies) and the classifier’s vulnerability to MI attacks, as measured by an MI attack’s accuracy gain over a random guess. We then propose to close the gap by matching the training and validation accuracies during training, by means of a new set regularizer using the Maximum Mean Discrepancy between the softmax output empirical distributions of the training and validation sets. Our experimental results show that combining this approach with another simple defense (mix-up training) significantly improves state-of-the-art defense against MI attacks, with minimal impact on testing accuracy.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12062v1
PDF	https://arxiv.org/pdf/2002.12062v1.pdf
PWC	https://paperswithcode.com/paper/membership-inference-attacks-and-defenses-in
Repo
Framework
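
The Maximum Mean Discrepancy mentioned in the abstract above compares two samples through kernel averages. The sketch below computes a biased estimate of squared MMD between two small sets of softmax outputs. The Gaussian kernel, its bandwidth, the function names, and the toy probability vectors are assumptions; the paper's actual regularizer may differ in all of these.

```python
from math import exp


def gaussian_kernel(p, q, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical softmax outputs: confident on training data, less so on validation.
train_probs = [[0.9, 0.1], [0.8, 0.2]]
val_probs = [[0.6, 0.4], [0.5, 0.5]]
gap_penalty = mmd_squared(train_probs, val_probs)
```

A penalty of this kind is zero when the two samples coincide and grows as the training and validation output distributions drift apart, which is the behaviour a gap-closing regularizer needs.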

Discovering linguistic (ir)regularities in word embeddings through max-margin separating hyperplanes


Title	Discovering linguistic (ir)regularities in word embeddings through max-margin separating hyperplanes
Authors	Noel Kennedy, Imogen Schofield, Dave C. Brodbelt, David B. Church, Dan G. O’Neill
Abstract	We experiment with new methods for learning how related words are positioned relative to each other in word embedding spaces. Previous approaches learned constant vector offsets: vectors that point from source tokens to target tokens with an assumption that these offsets were parallel to each other. We show that the offsets between related tokens are closer to orthogonal than parallel, and that they have low cosine similarities. We proceed by making a different assumption: target tokens are linearly separable from source and un-labeled tokens. We show that a max-margin hyperplane can separate target tokens and that vectors orthogonal to this hyperplane represent the relationship between source and targets. We find that this representation of the relationship obtains the best results in discovering linguistic regularities. We experiment with vector space models trained by a variety of algorithms (Word2vec: CBOW/skip-gram, fastText, or GloVe), and various word context choices such as linear word-order, syntax dependency grammars, and with and without knowledge of word position. These experiments show that our model, SVMCos, is robust to a range of experimental choices when training word embeddings.
Tasks	Word Embeddings
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03654v1
PDF	https://arxiv.org/pdf/2003.03654v1.pdf
PWC	https://paperswithcode.com/paper/discovering-linguistic-irregularities-in-word
Repo
Framework
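
The claim in the abstract above, that offsets between related tokens are closer to orthogonal than parallel, rests on comparing offset vectors by cosine similarity. The sketch below shows that check on made-up 2-dimensional vectors; the helper names and all vector values are hypothetical and are not taken from any trained embedding.

```python
from math import sqrt


def offset(source, target):
    """Vector pointing from a source embedding to a target embedding."""
    return [t - s for s, t in zip(source, target)]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical (source, target) embedding pairs for one relation.
pair_a = ([0.0, 0.0], [1.0, 0.0])
pair_b = ([2.0, 1.0], [4.0, 1.0])   # offset parallel to pair_a
pair_c = ([0.0, 0.0], [0.0, 3.0])   # offset orthogonal to pair_a

parallel = cosine_similarity(offset(*pair_a), offset(*pair_b))
orthogonal = cosine_similarity(offset(*pair_a), offset(*pair_c))
```

If a constant-offset model held exactly, every pair of offsets for a relation would score near 1; values near 0 correspond to the near-orthogonal case the abstract describes.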

Population-Guided Parallel Policy Search for Reinforcement Learning


Title	Population-Guided Parallel Policy Search for Reinforcement Learning
Authors	Whiyoung Jung, Giseung Park, Youngchul Sung
Abstract	In this paper, a new population-guided parallel learning scheme is proposed to enhance the performance of off-policy reinforcement learning (RL). In the proposed scheme, multiple identical learners with their own value-functions and policies share a common experience replay buffer, and search for a good policy in collaboration with the guidance of the best policy information. The key point is that the information of the best policy is fused in a soft manner by constructing an augmented loss function for policy update to enlarge the overall search region covered by the multiple learners. The guidance by the previous best policy and the enlarged range enable faster and better policy search. Monotone improvement of the expected cumulative return by the proposed scheme is proved theoretically. Working algorithms are constructed by applying the proposed scheme to the twin delayed deep deterministic (TD3) policy gradient algorithm. Numerical results show that the constructed algorithm outperforms most of the current state-of-the-art RL algorithms, and the gain is significant in sparse-reward environments.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.02907v1
PDF	https://arxiv.org/pdf/2001.02907v1.pdf
PWC	https://paperswithcode.com/paper/population-guided-parallel-policy-search-for-1
Repo
Framework

The Incentives that Shape Behaviour


Title	The Incentives that Shape Behaviour
Authors	Ryan Carey, Eric Langlois, Tom Everitt, Shane Legg
Abstract	Which variables does an agent have an incentive to control with its decision, and which variables does it have an incentive to respond to? We formalise these incentives, and demonstrate unique graphical criteria for detecting them in any single decision causal influence diagram. To this end, we introduce structural causal influence models, a hybrid of the influence diagram and structural causal model frameworks. Finally, we illustrate how these incentives predict agent incentives in both fairness and AI safety applications.
Tasks
Published	2020-01-20
URL	https://arxiv.org/abs/2001.07118v1
PDF	https://arxiv.org/pdf/2001.07118v1.pdf
PWC	https://paperswithcode.com/paper/the-incentives-that-shape-behaviour
Repo
Framework

Practical method to reclassify Web of Science articles into unique subject categories and broad disciplines


Title	Practical method to reclassify Web of Science articles into unique subject categories and broad disciplines
Authors	Staša Milojević
Abstract	Classification of bibliographic items into subjects and disciplines in large databases is essential for many quantitative science studies. The Web of Science classification of journals into ~250 subject categories, which has served as a basis for many studies, is known to have some fundamental problems and several practical limitations that may affect the results from such studies. Here we present an easily reproducible method to perform reclassification of the Web of Science into existing subject categories and into 14 broad areas. Our reclassification is at a level of articles, so it preserves disciplinary differences that may exist among individual articles published in the same journal. Reclassification also eliminates ambiguous (multiple) categories that are found for 50% of items, and assigns a discipline/field category to all articles that come from broad-coverage journals such as Nature and Science. The correctness of the assigned subject categories is evaluated manually and is found to be ~95%.
Tasks
Published	2020-01-08
URL	https://arxiv.org/abs/2001.02733v1
PDF	https://arxiv.org/pdf/2001.02733v1.pdf
PWC	https://paperswithcode.com/paper/practical-method-to-reclassify-web-of-science
Repo
Framework

Task-Independent Spiking Central Pattern Generator: A Learning-Based Approach


Title	Task-Independent Spiking Central Pattern Generator: A Learning-Based Approach
Authors	Elie Aljalbout, Florian Walter, Florian Röhrbein, Alois Knoll
Abstract	Legged locomotion is a challenging task in the field of robotics but a rather simple one in nature. This motivates the use of biological methodologies as solutions to this problem. Central pattern generators are neural networks that are thought to be responsible for locomotion in humans and some animal species. As for robotics, many attempts were made to reproduce such systems and use them for a similar goal. One interesting design model is based on spiking neural networks. This model is the main focus of this work, as its contribution is not limited to engineering but also applicable to neuroscience. This paper introduces a new general framework for building central pattern generators that are task-independent, biologically plausible, and rely on learning methods. The abilities and properties of the presented approach are not only evaluated in simulation but also in a robotic experiment. The results are very promising as the used robot was able to perform stable walking at different speeds and to change speed within the same gait cycle.
Tasks
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07477v1
PDF	https://arxiv.org/pdf/2003.07477v1.pdf
PWC	https://paperswithcode.com/paper/task-independent-spiking-central-pattern
Repo
Framework

Towards Patient Record Summarization Through Joint Phenotype Learning in HIV Patients


Title	Towards Patient Record Summarization Through Joint Phenotype Learning in HIV Patients
Authors	Gal Levy-Fix, Jason Zucker, Konstantin Stojanovic, Noémie Elhadad
Abstract	Identifying a patient’s key problems over time is a common task for providers at the point of care, yet a complex and time-consuming activity given current electronic health records. To enable a problem-oriented summarizer to identify a patient’s comprehensive list of problems and their salience, we propose an unsupervised phenotyping approach that jointly learns a large number of phenotypes/problems across structured and unstructured data. To identify the appropriate granularity of the learned phenotypes, the model is trained on a target patient population of the same clinic. To enable the content organization of a problem-oriented summarizer, the model identifies phenotype relatedness as well. The model leverages a correlated-mixed membership approach with variational inference applied to heterogeneous clinical data. In this paper, we focus our experiments on assessing the learned phenotypes and their relatedness as learned from a specific patient population. We ground our experiments in phenotyping patients from an HIV clinic in a large urban care institution (n=7,523), where patients have voluminous, longitudinal documentation, and where providers would benefit from summaries of these patients’ medical histories, whether about their HIV or any comorbidities. We find that the learned phenotypes and their relatedness are clinically valid when assessed qualitatively by clinical experts, and that the model surpasses baseline in inferring phenotype-relatedness when comparing to existing expert-curated condition groupings.
Tasks
Published	2020-03-09
URL	https://arxiv.org/abs/2003.11474v1
PDF	https://arxiv.org/pdf/2003.11474v1.pdf
PWC	https://paperswithcode.com/paper/towards-patient-record-summarization-through
Repo
Framework

A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks


Title	A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks
Authors	Phan-Minh Nguyen, Huy Tuan Pham
Abstract	We develop a mathematically rigorous framework for multilayer neural networks in the mean field regime. As the network’s width increases, the network’s learning trajectory is shown to be well captured by a meaningful and dynamically nonlinear limit (the mean field limit), which is characterized by a system of ODEs. Our framework applies to a broad range of network architectures, learning dynamics and network initializations. Central to the framework is the new idea of a neuronal embedding, which comprises a non-evolving probability space that allows neural networks of arbitrary widths to be embedded. We demonstrate two applications of our framework. First, the framework gives a principled way to study the simplifying effects that independent and identically distributed initializations have on the mean field limit. Second, we prove a global convergence guarantee for two-layer and three-layer networks. Unlike previous works that rely on convexity, our result requires a certain universal approximation property, which is a distinctive feature of infinite-width neural networks. To the best of our knowledge, this is the first time global convergence is established for neural networks of more than two layers in the mean field regime.
Tasks
Published	2020-01-30
URL	https://arxiv.org/abs/2001.11443v1
PDF	https://arxiv.org/pdf/2001.11443v1.pdf
PWC	https://paperswithcode.com/paper/a-rigorous-framework-for-the-mean-field-limit
Repo
Framework