April 2, 2020


Paper Group ANR 122

Towards precise causal effect estimation from data with hidden variables. Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition. A Formal Analysis of Multimodal Referring Strategies Under Common Ground. Learning medical triage from clinicians using Deep Q-Learning. On the Distributio …

Towards precise causal effect estimation from data with hidden variables


Title	Towards precise causal effect estimation from data with hidden variables
Authors	Debo Cheng, Jiuyong Li, Lin Liu, Kui Yu, Thuc Duy Lee, Jixue Liu
Abstract	Causal effect estimation from observational data is a crucial but challenging task. Currently, only a limited number of data-driven causal effect estimation methods are available. These methods either only provide a bound estimation of the causal effect of a treatment on the outcome, or have impractical assumptions on the data or low efficiency although providing a unique estimation of the causal effect. In this paper, we identify a practical problem setting and propose an approach to achieving unique causal effect estimation from data with hidden variables under this setting. For the approach, we develop the theorems to support the discovery of the proper covariate sets for confounding adjustment (adjustment sets). Based on the theorems, two algorithms are presented for finding the proper adjustment sets from data with hidden variables to obtain unbiased and unique causal effect estimation. Experiments with benchmark Bayesian networks and real-world datasets have demonstrated the efficiency and effectiveness of the proposed algorithms, indicating the practicability of the identified problem setting and the potential of the approach in real-world applications.
Tasks
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10091v1
PDF	https://arxiv.org/pdf/2002.10091v1.pdf
PWC	https://paperswithcode.com/paper/towards-precise-causal-effect-estimation-from
Repo
Framework

Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition


Title	Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition
Authors	Zhentao Tang, Yuanheng Zhu, Dongbin Zhao, Simon M. Lucas
Abstract	The Fighting Game AI Competition (FTGAIC) provides a challenging benchmark for 2-player video game AI. The challenge arises from the large action space, diverse styles of characters and abilities, and the real-time nature of the game. In this paper, we propose a novel algorithm that combines Rolling Horizon Evolution Algorithm (RHEA) with opponent model learning. The approach is readily applicable to any 2-player video game. In contrast to conventional RHEA, an opponent model is proposed and is optimized by supervised learning with cross-entropy and reinforcement learning with policy gradient and Q-learning respectively, based on history observations from the opponent. The model is learned during the live gameplay. With the learned opponent model, the extended RHEA is able to make more realistic plans based on what the opponent is likely to do. This tends to lead to better results. We compared our approach directly with the bots from the FTGAIC 2018 competition, and found our method to significantly outperform all of them, for all three characters. Furthermore, our proposed bot with the policy-gradient-based opponent model is the only one without using Monte-Carlo Tree Search (MCTS) among top five bots in the 2019 competition in which it achieved second place, while using much less domain knowledge than the winner.
Tasks	Q-Learning
Published	2020-03-31
URL	https://arxiv.org/abs/2003.13949v1
PDF	https://arxiv.org/pdf/2003.13949v1.pdf
PWC	https://paperswithcode.com/paper/enhanced-rolling-horizon-evolution-algorithm
Repo
Framework

A Formal Analysis of Multimodal Referring Strategies Under Common Ground


Title	A Formal Analysis of Multimodal Referring Strategies Under Common Ground
Authors	Nikhil Krishnaswamy, James Pustejovsky
Abstract	In this paper, we present an analysis of computationally generated mixed-modality definite referring expressions using combinations of gesture and linguistic descriptions. In doing so, we expose some striking formal semantic properties of the interactions between gesture and language, conditioned on the introduction of content into the common ground between the (computational) speaker and (human) viewer, and demonstrate how these formal features can contribute to training better models to predict viewer judgment of referring expressions, and potentially to the generation of more natural and informative referring expressions.
Tasks
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07385v1
PDF	https://arxiv.org/pdf/2003.07385v1.pdf
PWC	https://paperswithcode.com/paper/a-formal-analysis-of-multimodal-referring
Repo
Framework

Learning medical triage from clinicians using Deep Q-Learning


Title	Learning medical triage from clinicians using Deep Q-Learning
Authors	Albert Buchard, Baptiste Bouvier, Giulia Prando, Rory Beard, Michail Livieratos, Dan Busbridge, Daniel Thompson, Jonathan Richens, Yuanzhao Zhang, Adam Baker, Yura Perov, Kostis Gourgoulias, Saurabh Johri
Abstract	Medical Triage is of paramount importance to healthcare systems, allowing for the correct orientation of patients and allocation of the necessary resources to treat them adequately. While reliable decision-tree methods exist to triage patients based on their presentation, those trees implicitly require human inference and are not immediately applicable in a fully automated setting. On the other hand, learning triage policies directly from experts may correct for some of the limitations of hard-coded decision-trees. In this work, we present a Deep Reinforcement Learning approach (a variant of Deep Q-Learning) to triage patients using curated clinical vignettes. The dataset, consisting of 1374 clinical vignettes, was created by medical doctors to represent real-life cases. Each vignette is associated with an average of 3.8 expert triage decisions given by medical doctors relying solely on medical history. We show that this approach is on a par with human performance, yielding safe triage decisions in 94% of cases, and matching expert decisions in 85% of cases. The trained agent learns when to stop asking questions, acquires optimized decision policies requiring less evidence than supervised approaches, and adapts to the novelty of a situation by asking for more information. Overall, we demonstrate that a Deep Reinforcement Learning approach can learn effective medical triage policies directly from expert decisions, without requiring expert knowledge engineering. This approach is scalable and can be deployed in healthcare settings or geographical regions with distinct triage specifications, or where trained experts are scarce, to improve decision making in the early stage of care.
Tasks	Decision Making, Q-Learning
Published	2020-03-28
URL	https://arxiv.org/abs/2003.12828v1
PDF	https://arxiv.org/pdf/2003.12828v1.pdf
PWC	https://paperswithcode.com/paper/learning-medical-triage-from-clinicians-using
Repo
Framework

On the Distribution of Minima in Intrinsic-Metric Rotation Averaging


Title	On the Distribution of Minima in Intrinsic-Metric Rotation Averaging
Authors	Kyle Wilson, David Bindel
Abstract	Rotation Averaging is a non-convex optimization problem that determines orientations of a collection of cameras from their images of a 3D scene. The problem has been studied using a variety of distances and robustifiers. The intrinsic (or geodesic) distance on SO(3) is geometrically meaningful; but while some extrinsic distance-based solvers admit (conditional) guarantees of correctness, no comparable results have been found under the intrinsic metric. In this paper, we study the spatial distribution of local minima. First, we do a novel empirical study to demonstrate sharp transitions in qualitative behavior: as problems become noisier, they transition from a single (easy-to-find) dominant minimum to a cost surface filled with minima. In the second part of this paper we derive a theoretical bound for when this transition occurs. This is an extension of the results of [24], which used local convexity as a proxy to study the difficulty of the problem. By recognizing the underlying quotient manifold geometry of the problem we achieve an n-fold improvement over prior work. Incidentally, our analysis also extends the prior $l_2$ work to general $l_p$ costs. Our results suggest using algebraic connectivity as an indicator of problem difficulty.
Tasks
Published	2020-03-18
URL	https://arxiv.org/abs/2003.08310v1
PDF	https://arxiv.org/pdf/2003.08310v1.pdf
PWC	https://paperswithcode.com/paper/on-the-distribution-of-minima-in-intrinsic
Repo
Framework

Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints


Title	Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints
Authors	Qinbo Bai, Ather Gattami, Vaneet Aggarwal
Abstract	In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (MDP). This paper considers a model-free approach to the problem, where the transition probabilities are not known. In the presence of peak constraints, the agent has to choose the policy to maximize the long-term average reward as well as satisfy the constraints at each time. We propose modifications to the standard Q-learning problem for unconstrained optimization to come up with an algorithm with peak constraints. The proposed algorithm is shown to achieve $O(T^{1/2+\gamma})$ regret bound for the obtained reward, and $O(T^{1-\gamma})$ regret bound for the constraint violation for any $\gamma \in (0,1/2)$ and time-horizon $T$. We note that these are the first results on regret analysis for constrained MDP, where the transition probabilities are not known a priori. We demonstrate the proposed algorithm on an energy harvesting problem where it outperforms state-of-the-art and performs close to the theoretical upper bound of the studied optimization problem.
Tasks	Q-Learning
Published	2020-03-11
URL	https://arxiv.org/abs/2003.05555v1
PDF	https://arxiv.org/pdf/2003.05555v1.pdf
PWC	https://paperswithcode.com/paper/model-free-algorithm-and-regret-analysis-for
Repo
Framework
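
The abstract above builds on the standard Q-learning update for unconstrained problems. As a reference point only, here is a minimal sketch of that textbook tabular update in its discounted form. It is not the paper's peak-constrained, average-reward algorithm; the state and action names, the step size `alpha`, and the discount factor `discount` are all illustrative assumptions (and `discount` is unrelated to the $\gamma$ in the regret bounds).

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, discount=0.9):
    """Apply one standard (unconstrained) tabular Q-learning update in place.

    q maps (state, action) pairs to value estimates. Returns the new estimate.
    """
    # Greedy value of the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target and incremental update.
    td_target = reward + discount * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action example loosely themed on energy harvesting.
q = defaultdict(float)
actions = ["charge", "transmit"]
new_value = q_learning_update(q, "low", "charge", 1.0, "high", actions)
```

Starting from an all-zero table, the first update moves the estimate a fraction `alpha` of the way towards the observed reward.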

Warm Starting Bandits with Side Information from Confounded Data


Title	Warm Starting Bandits with Side Information from Confounded Data
Authors	Nihal Sharma, Soumya Basu, Karthikeyan Shanmugam, Sanjay Shakkottai
Abstract	We study a variant of the multi-armed bandit problem where side information in the form of bounds on the mean of each arm is provided. We describe how these bounds on the means can be used efficiently for warm starting bandits. Specifically, we propose the novel UCB-SI algorithm, and illustrate improvements in cumulative regret over the standard UCB algorithm, both theoretically and empirically, in the presence of non-trivial side information. As noted in (Zhang & Bareinboim, 2017), such information arises, for instance, when we have prior logged data on the arms, but this data has been collected under a policy whose choice of arms is based on latent variables to which access is no longer available. We further provide a novel approach for obtaining such bounds from prior partially confounded data under some mild assumptions. We validate our findings through semi-synthetic experiments on data derived from real datasets.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08405v1
PDF	https://arxiv.org/pdf/2002.08405v1.pdf
PWC	https://paperswithcode.com/paper/warm-starting-bandits-with-side-information
Repo
Framework
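
One simple way to picture how known bounds on the arm means could help a UCB-style learner is to clip each optimistic index at the arm's known upper bound. The sketch below illustrates only that idea; it is not the UCB-SI algorithm from the paper, and the function names and the `sqrt(2 ln t / n)` exploration bonus are assumptions chosen for illustration.

```python
import math


def clipped_ucb_index(mean, pulls, t, upper_bound):
    """UCB1-style optimistic index, capped at a known upper bound on the arm mean."""
    if pulls == 0:
        # An unexplored arm is only as promising as its side-information bound.
        return upper_bound
    bonus = math.sqrt(2.0 * math.log(t) / pulls)
    return min(mean + bonus, upper_bound)


def choose_arm(means, pulls, t, upper_bounds):
    """Return the position of the arm with the largest clipped index."""
    indices = [clipped_ucb_index(m, n, t, u)
               for m, n, u in zip(means, pulls, upper_bounds)]
    return max(range(len(indices)), key=indices.__getitem__)


# Hypothetical example: the first arm has the higher empirical mean,
# but side information says its true mean cannot exceed 0.6.
selected = choose_arm(means=[0.5, 0.4], pulls=[10, 10], t=20,
                      upper_bounds=[0.6, 1.0])
```

In this example the unclipped index would favour the first arm, while the clipped index selects the second, because the first arm's optimism is capped by its bound.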

Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle


Title	Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle
Authors	Zhenshan Bing, Claus Meschede, Guang Chen, Alois Knoll, Kai Huang
Abstract	Building spiking neural networks (SNNs) based on biological synaptic plasticities holds a promising potential for accomplishing fast and energy-efficient computing, which is beneficial to mobile robotic applications. However, the implementations of SNNs in robotic fields are limited due to the lack of practical training methods. In this paper, we therefore introduce both indirect and direct end-to-end training methods of SNNs for a lane-keeping vehicle. First, we adopt a policy learned using the Deep Q-Learning (DQN) algorithm and then subsequently transfer it to an SNN using supervised learning. Second, we adopt the reward-modulated spike-timing-dependent plasticity (R-STDP) for training SNNs directly, since it combines the advantages of both reinforcement learning and the well-known spike-timing-dependent plasticity (STDP). We examine the proposed approaches in three scenarios in which a robot is controlled to keep within lane markings by using an event-based neuromorphic vision sensor. We further demonstrate the advantages of the R-STDP approach in terms of the lateral localization accuracy and training time steps by comparing them with the other three algorithms presented in this paper.
Tasks	Q-Learning
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04603v1
PDF	https://arxiv.org/pdf/2003.04603v1.pdf
PWC	https://paperswithcode.com/paper/indirect-and-direct-training-of-spiking
Repo
Framework

Private Machine Learning via Randomised Response


Title	Private Machine Learning via Randomised Response
Authors	David Barber
Abstract	We introduce a general learning framework for private machine learning based on randomised response. Our assumption is that all actors are potentially adversarial and as such we trust only to release a single noisy version of an individual’s datapoint. We discuss a general approach that forms a consistent way to estimate the true underlying machine learning model and demonstrate this in the case of logistic regression.
Tasks
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04942v2
PDF	https://arxiv.org/pdf/2001.04942v2.pdf
PWC	https://paperswithcode.com/paper/private-machine-learning-via-randomised
Repo
Framework
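
For readers unfamiliar with randomised response, the classic binary version shows why a consistent estimate is still possible when only a noisy copy of each datapoint is released. The sketch below covers that textbook case only, not the paper's general learning framework; `keep_prob` and the function names are assumptions for illustration. If a bit is kept with probability $p$ and flipped otherwise, the released bits have mean $(2p-1)m + (1-p)$ for true mean $m$, which can be inverted.

```python
import random


def randomise_bit(bit, keep_prob, rng):
    """Release the true bit with probability keep_prob, otherwise its flip."""
    return bit if rng.random() < keep_prob else 1 - bit


def debias_mean(noisy_mean, keep_prob):
    """Invert E[noisy] = (2p - 1) * m + (1 - p) to estimate the true mean m."""
    if keep_prob == 0.5:
        raise ValueError("keep_prob = 0.5 destroys all information about the data")
    return (noisy_mean - (1.0 - keep_prob)) / (2.0 * keep_prob - 1.0)


# Hypothetical usage: each individual releases one noisy bit.
rng = random.Random(0)
true_bits = [1] * 300 + [0] * 700
noisy_bits = [randomise_bit(b, 0.8, rng) for b in true_bits]
estimate = debias_mean(sum(noisy_bits) / len(noisy_bits), 0.8)
```

The estimate is noisy for any finite sample, but the correction is exact in expectation.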

Breaking Batch Normalization for better explainability of Deep Neural Networks through Layer-wise Relevance Propagation


Title	Breaking Batch Normalization for better explainability of Deep Neural Networks through Layer-wise Relevance Propagation
Authors	Mathilde Guillemot, Catherine Heusele, Rodolphe Korichi, Sylvianne Schnebert, Liming Chen
Abstract	The lack of transparency of neural networks remains a major obstacle to their use. The Layer-wise Relevance Propagation technique builds heat-maps representing the relevance of each input in the model's decision. The relevance spreads backward from the last to the first layer of the Deep Neural Network. Layer-wise Relevance Propagation does not handle normalization layers; in this work we suggest a method to include them. Specifically, we build an equivalent network fusing normalization layers and convolutional or fully connected layers. Heatmaps obtained with our method on the MNIST and CIFAR-10 datasets are more accurate for convolutional layers. Our study also cautions against using Layer-wise Relevance Propagation with networks that combine fully connected layers and normalization layers.
Tasks
Published	2020-02-24
URL	https://arxiv.org/abs/2002.11018v1
PDF	https://arxiv.org/pdf/2002.11018v1.pdf
PWC	https://paperswithcode.com/paper/breaking-batch-normalization-for-better
Repo
Framework
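
Fusing a normalization layer into the preceding affine layer is a well-known identity at inference time, and a small example may make the "equivalent network" idea concrete. The sketch below folds batch-normalization statistics into a single fully connected unit; it shows the standard folding formula and may differ from the exact construction used by the authors. All names and numeric values are illustrative assumptions.

```python
import math


def affine(weights, bias, x):
    """Compute w . x + b for one output unit."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def fold_batchnorm(weights, bias, gamma, beta, running_mean, running_var,
                   eps=1e-5):
    """Fold y = gamma * (w . x + b - mean) / sqrt(var + eps) + beta
    into a single affine unit with fused weights and bias."""
    scale = gamma / math.sqrt(running_var + eps)
    fused_weights = [w * scale for w in weights]
    fused_bias = (bias - running_mean) * scale + beta
    return fused_weights, fused_bias


# Hypothetical unit with made-up parameters and input.
w, b = [0.5, -1.0, 2.0], 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0
x = [1.0, 2.0, 3.0]

fused_w, fused_b = fold_batchnorm(w, b, gamma, beta, mean, var)
original = gamma * (affine(w, b, x) - mean) / math.sqrt(var + 1e-5) + beta
fused = affine(fused_w, fused_b, x)
```

After folding, `fused` and `original` agree up to floating-point error, so a relevance method that only understands affine layers can be applied to the fused unit.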

A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification


Title	A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification
Authors	Seyedeh Faezeh Farahbakhshian, Milad Taleby Ahvanooey
Abstract	In statistics and machine learning, feature selection is the process of picking a subset of relevant attributes for use in a predictive model. Recently, rough set-based feature selection techniques, which employ feature dependency to perform the selection process, have drawn attention. Classification of tumors based on gene expression is utilized to diagnose proper treatment and prognosis of the disease in bioinformatics applications. Microarray gene expression data includes superfluous feature genes of high dimensionality and smaller training instances. Since exact supervised classification of gene expression instances in such high-dimensional problems is very complex, the selection of appropriate genes is a crucial task for tumor classification. In this study, we present a new technique for gene selection using a discernibility matrix of fuzzy-rough sets. The proposed technique takes into account the similarity of those instances that have the same and different class labels to improve the gene selection results, while the previous state-of-the-art approaches only address the similarity of instances with different class labels. To meet that requirement, we extend the Johnson reducer technique into the fuzzy case. Experimental results demonstrate that this technique provides better efficiency compared to the state-of-the-art approaches.
Tasks	Feature Selection
Published	2020-03-26
URL	https://arxiv.org/abs/2003.12386v1
PDF	https://arxiv.org/pdf/2003.12386v1.pdf
PWC	https://paperswithcode.com/paper/a-new-gene-selection-algorithm-using-fuzzy
Repo
Framework

Unifying Theorems for Subspace Identification and Dynamic Mode Decomposition


Title	Unifying Theorems for Subspace Identification and Dynamic Mode Decomposition
Authors	Sungho Shin, Qiugang Lu, Victor M. Zavala
Abstract	This paper presents unifying results for subspace identification (SID) and dynamic mode decomposition (DMD) for autonomous dynamical systems. We observe that SID seeks to solve an optimization problem to estimate an extended observability matrix and a state sequence that minimizes the prediction error for the state-space model. Moreover, we observe that DMD seeks to solve a rank-constrained matrix regression problem that minimizes the prediction error of an extended autoregressive model. We prove that existence conditions for perfect (error-free) state-space and low-rank extended autoregressive models are equivalent and that the SID and DMD optimization problems are equivalent. We exploit these results to propose a SID-DMD algorithm that delivers a provably optimal model and that is easy to implement. We demonstrate our developments using a case study that aims to build dynamical models directly from video data.
Tasks
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07410v1
PDF	https://arxiv.org/pdf/2003.07410v1.pdf
PWC	https://paperswithcode.com/paper/unifying-theorems-for-subspace-identification
Repo
Framework

Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampled Implicit Ensembles


Title	Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampled Implicit Ensembles
Authors	Evgenii Tsymbalov, Kirill Fedyanin, Maxim Panov
Abstract	Modern machine learning models usually do not extrapolate well, i.e., they often have high prediction errors in the regions of sample space lying far from the training data. In high dimensional spaces detecting out-of-distribution points becomes a non-trivial problem. Thus, uncertainty estimation for model predictions becomes crucial for the successful application of machine learning models in many applications. In this work, we show that increasing the diversity of realizations sampled from a neural network with dropout helps to improve the quality of uncertainty estimation. In a series of experiments on simulated and real-world data, we demonstrate that diversification via determinantal point processes-based sampling allows achieving state-of-the-art results in uncertainty estimation for regression and classification tasks. Importantly, our approach does not require any modification to the models or training procedures, allowing for straightforward application to any deep learning model with dropout layers.
Tasks	Point Processes
Published	2020-03-06
URL	https://arxiv.org/abs/2003.03274v1
PDF	https://arxiv.org/pdf/2003.03274v1.pdf
PWC	https://paperswithcode.com/paper/dropout-strikes-back-improved-uncertainty
Repo
Framework

Pretrained Transformers for Simple Question Answering over Knowledge Graphs


Title	Pretrained Transformers for Simple Question Answering over Knowledge Graphs
Authors	D. Lukovnikov, A. Fischer, J. Lehmann
Abstract	Answering simple questions over knowledge graphs is a well-studied problem in question answering. Previous approaches for this task built on recurrent and convolutional neural network based architectures that use pretrained word embeddings. It was recently shown that finetuning pretrained transformer networks (e.g. BERT) can outperform previous approaches on various natural language processing tasks. In this work, we investigate how well BERT performs on SimpleQuestions and provide an evaluation of both BERT and BiLSTM-based models in data-sparse scenarios.
Tasks	Knowledge Graphs, Question Answering, Word Embeddings
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11985v1
PDF	https://arxiv.org/pdf/2001.11985v1.pdf
PWC	https://paperswithcode.com/paper/pretrained-transformers-for-simple-question
Repo
Framework

Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems


Title	Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems
Authors	Akhil Mathur, Anton Isopoussu, Fahim Kawsar, Nadia Berthouze, Nicholas D. Lane
Abstract	Mobile and embedded devices are increasingly using microphones and audio-based computational models to infer user context. A major challenge in building systems that combine audio models with commodity microphones is to guarantee their accuracy and robustness in the real world. Besides many environmental dynamics, a primary factor that impacts the robustness of audio models is microphone variability. In this work, we propose Mic2Mic – a machine-learned system component – which resides in the inference pipeline of audio models and in real time reduces the variability in audio data caused by microphone-specific factors. Two key considerations for the design of Mic2Mic were: a) to decouple the problem of microphone variability from the audio task, and b) to put a minimal burden on end-users to provide training data. With these in mind, we apply the principles of cycle-consistent generative adversarial networks (CycleGANs) to learn Mic2Mic using unlabeled and unpaired data collected from different microphones. Our experiments show that Mic2Mic can recover between 66% and 89% of the accuracy lost due to microphone variability for two common audio tasks.
Tasks
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12425v1
PDF	https://arxiv.org/pdf/2003.12425v1.pdf
PWC	https://paperswithcode.com/paper/mic2mic-using-cycle-consistent-generative
Repo
Framework