May 6, 2019

2885 words 14 mins read

Paper Group ANR 331

Multi-Agent Path Finding with Delay Probabilities. Building Machines That Learn and Think Like People. Free Lunch for Optimisation under the Universal Distribution. Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling. AI Researchers, Video Games Are Your Friends!. Deep Reinforcement Learning From Raw Pixels in Do …

Multi-Agent Path Finding with Delay Probabilities


Title	Multi-Agent Path Finding with Delay Probabilities
Authors	Hang Ma, T. K. Satish Kumar, Sven Koenig
Abstract	Several recently developed Multi-Agent Path Finding (MAPF) solvers scale to large MAPF instances by searching for MAPF plans on 2 levels: The high-level search resolves collisions between agents, and the low-level search plans paths for single agents under the constraints imposed by the high-level search. We make the following contributions to solve the MAPF problem with imperfect plan execution with small average makespans: First, we formalize the MAPF Problem with Delay Probabilities (MAPF-DP), define valid MAPF-DP plans and propose the use of robust plan-execution policies for valid MAPF-DP plans to control how each agent proceeds along its path. Second, we discuss 2 classes of decentralized robust plan-execution policies (called Fully Synchronized Policies and Minimal Communication Policies) that prevent collisions during plan execution for valid MAPF-DP plans. Third, we present a 2-level MAPF-DP solver (called Approximate Minimization in Expectation) that generates valid MAPF-DP plans.
Tasks	Multi-Agent Path Finding
Published	2016-12-15
URL	http://arxiv.org/abs/1612.05309v1
PDF	http://arxiv.org/pdf/1612.05309v1.pdf
PWC	https://paperswithcode.com/paper/multi-agent-path-finding-with-delay
Repo
Framework

Building Machines That Learn and Think Like People


Title	Building Machines That Learn and Think Like People
Authors	Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, Samuel J. Gershman
Abstract	Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.
Tasks	Board Games, Object Recognition
Published	2016-04-01
URL	http://arxiv.org/abs/1604.00289v3
PDF	http://arxiv.org/pdf/1604.00289v3.pdf
PWC	https://paperswithcode.com/paper/building-machines-that-learn-and-think-like
Repo
Framework

Free Lunch for Optimisation under the Universal Distribution


Title	Free Lunch for Optimisation under the Universal Distribution
Authors	Tom Everitt, Tor Lattimore, Marcus Hutter
Abstract	Function optimisation is a major challenge in computer science. The No Free Lunch theorems state that if all functions with the same histogram are assumed to be equally probable then no algorithm outperforms any other in expectation. We argue against the uniform assumption and suggest a universal prior exists for which there is a free lunch, but where no particular class of functions is favoured over another. We also prove upper and lower bounds on the size of the free lunch.
Tasks
Published	2016-08-16
URL	http://arxiv.org/abs/1608.04544v1
PDF	http://arxiv.org/pdf/1608.04544v1.pdf
PWC	https://paperswithcode.com/paper/free-lunch-for-optimisation-under-the
Repo
Framework

Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling


Title	Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling
Authors	Hosnieh Sattar, Andreas Bulling, Mario Fritz
Abstract	Predicting the target of visual search from eye fixation (gaze) data is a challenging problem with many applications in human-computer interaction. In contrast to previous work that has focused on individual instances as a search target, we propose the first approach to predict categories and attributes of search targets based on gaze data. However, state of the art models for categorical recognition, in general, require large amounts of training data, which is prohibitive for gaze data. To address this challenge, we propose a novel Gaze Pooling Layer that integrates gaze information into CNN-based architectures as an attention mechanism - incorporating both spatial and temporal aspects of human gaze behavior. We show that our approach is effective even when the gaze pooling layer is added to an already trained CNN, thus eliminating the need for expensive joint data collection of visual and gaze data. We propose an experimental setup and data set and demonstrate the effectiveness of our method for search target prediction based on gaze behavior. We further study how to integrate temporal and spatial gaze information most effectively, and indicate directions for future research in the gaze-based prediction of mental states.
Tasks
Published	2016-11-27
URL	http://arxiv.org/abs/1611.10162v3
PDF	http://arxiv.org/pdf/1611.10162v3.pdf
PWC	https://paperswithcode.com/paper/predicting-the-category-and-attributes-of
Repo
Framework

AI Researchers, Video Games Are Your Friends!


Title	AI Researchers, Video Games Are Your Friends!
Authors	Julian Togelius
Abstract	If you are an artificial intelligence researcher, you should look to video games as ideal testbeds for the work you do. If you are a video game developer, you should look to AI for the technology that makes completely new types of games possible. This chapter lays out the case for both of these propositions. It asks the question “what can video games do for AI”, and discusses how in particular general video game playing is the ideal testbed for artificial general intelligence research. It then asks the question “what can AI do for video games”, and lays out a vision for what video games might look like if we had significantly more advanced AI at our disposal. The chapter is based on my keynote at IJCCI 2015, and is written in an attempt to be accessible to a broad audience.
Tasks
Published	2016-12-06
URL	http://arxiv.org/abs/1612.01608v1
PDF	http://arxiv.org/pdf/1612.01608v1.pdf
PWC	https://paperswithcode.com/paper/ai-researchers-video-games-are-your-friends
Repo
Framework

Deep Reinforcement Learning From Raw Pixels in Doom


Title	Deep Reinforcement Learning From Raw Pixels in Doom
Authors	Danijar Hafner
Abstract	Using current reinforcement learning methods, it has recently become possible to learn to play unknown 3D games from raw pixels. In this work, we study the challenges that arise in such complex environments, and summarize current methods to approach these. We choose a task within the Doom game, that has not been approached yet. The goal for the agent is to fight enemies in a 3D world consisting of five rooms. We train the DQN and LSTM-A3C algorithms on this task. Results show that both algorithms learn sensible policies, but fail to achieve high scores given the amount of training. We provide insights into the learned behavior, which can serve as a valuable starting point for further research in the Doom domain.
Tasks
Published	2016-10-07
URL	http://arxiv.org/abs/1610.02164v1
PDF	http://arxiv.org/pdf/1610.02164v1.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-from-raw-pixels
Repo
Framework

Identifying Metastases in Sentinel Lymph Nodes with Deep Convolutional Neural Networks


Title	Identifying Metastases in Sentinel Lymph Nodes with Deep Convolutional Neural Networks
Authors	Richard Chen, Yating Jing, Hunter Jackson
Abstract	Metastatic presence in lymph nodes is one of the most important prognostic variables of breast cancer. The current diagnostic procedure for manually reviewing sentinel lymph nodes, however, is very time-consuming and subjective. Pathologists have to manually scan an entire digital whole-slide image (WSI) for regions of metastasis that are sometimes only detectable under high resolution or entirely hidden from the human visual cortex. From October 2015 to April 2016, the International Symposium on Biomedical Imaging (ISBI) held the Camelyon Grand Challenge 2016 to crowd-source ideas and algorithms for automatic detection of lymph node metastasis. Using a generalizable stain normalization technique and the Proscia Pathology Cloud computing platform, we trained a deep convolutional neural network on millions of tissue and tumor image tiles to perform slide-based evaluation on our testing set of whole-slide images images, with a sensitivity of 0.96, specificity of 0.89, and AUC score of 0.90. Our results indicate that our platform can automatically scan any WSI for metastatic regions without institutional calibration to respective stain profiles.
Tasks	Calibration
Published	2016-08-04
URL	http://arxiv.org/abs/1608.01658v1
PDF	http://arxiv.org/pdf/1608.01658v1.pdf
PWC	https://paperswithcode.com/paper/identifying-metastases-in-sentinel-lymph
Repo
Framework

Highly Efficient Regression for Scalable Person Re-Identification


Title	Highly Efficient Regression for Scalable Person Re-Identification
Authors	Hanxiao Wang, Shaogang Gong, Tao Xiang
Abstract	Existing person re-identification models are poor for scaling up to large data required in real-world applications due to: (1) Complexity: They employ complex models for optimal performance resulting in high computational cost for training at a large scale; (2) Inadaptability: Once trained, they are unsuitable for incremental update to incorporate any new data available. This work proposes a truly scalable solution to re-id by addressing both problems. Specifically, a Highly Efficient Regression (HER) model is formulated by embedding the Fisher’s criterion to a ridge regression model for very fast re-id model learning with scalable memory/storage usage. Importantly, this new HER model supports faster than real-time incremental model updates therefore making real-time active learning feasible in re-id with human-in-the-loop. Extensive experiments show that such a simple and fast model not only outperforms notably the state-of-the-art re-id methods, but also is more scalable to large data with additional benefits to active learning for reducing human labelling effort in re-id deployment.
Tasks	Active Learning, Person Re-Identification
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01341v1
PDF	http://arxiv.org/pdf/1612.01341v1.pdf
PWC	https://paperswithcode.com/paper/highly-efficient-regression-for-scalable
Repo
Framework

An unexpected unity among methods for interpreting model predictions


Title	An unexpected unity among methods for interpreting model predictions
Authors	Scott Lundberg, Su-In Lee
Abstract	Understanding why a model made a certain prediction is crucial in many data science fields. Interpretable predictions engender appropriate trust and provide insight into how the model may be improved. However, with large modern datasets the best accuracy is often achieved by complex models even experts struggle to interpret, which creates a tension between accuracy and interpretability. Recently, several methods have been proposed for interpreting predictions from complex models by estimating the importance of input features. Here, we present how a model-agnostic additive representation of the importance of input features unifies current methods. This representation is optimal, in the sense that it is the only set of additive values that satisfies important properties. We show how we can leverage these properties to create novel visual explanations of model predictions. The thread of unity that this representation weaves through the literature indicates that there are common principles to be learned about the interpretation of model predictions that apply in many scenarios.
Tasks
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07478v3
PDF	http://arxiv.org/pdf/1611.07478v3.pdf
PWC	https://paperswithcode.com/paper/an-unexpected-unity-among-methods-for
Repo
Framework

Active Learning for Approximation of Expensive Functions with Normal Distributed Output Uncertainty


Title	Active Learning for Approximation of Expensive Functions with Normal Distributed Output Uncertainty
Authors	Joachim van der Herten, Ivo Couckuyt, Dirk Deschrijver, Tom Dhaene
Abstract	When approximating a black-box function, sampling with active learning focussing on regions with non-linear responses tends to improve accuracy. We present the FLOLA-Voronoi method introduced previously for deterministic responses, and theoretically derive the impact of output uncertainty. The algorithm automatically puts more emphasis on exploration to provide more information to the models.
Tasks	Active Learning
Published	2016-08-18
URL	http://arxiv.org/abs/1608.05225v1
PDF	http://arxiv.org/pdf/1608.05225v1.pdf
PWC	https://paperswithcode.com/paper/active-learning-for-approximation-of
Repo
Framework

Possibilistic Networks: Parameters Learning from Imprecise Data and Evaluation strategy


Title	Possibilistic Networks: Parameters Learning from Imprecise Data and Evaluation strategy
Authors	Maroua Haddad, Philippe Leray, Nahla Ben Amor
Abstract	There has been an ever-increasing interest in multidisciplinary research on representing and reasoning with imperfect data. Possibilistic networks present one of the powerful frameworks of interest for representing uncertain and imprecise information. This paper covers the problem of their parameters learning from imprecise datasets, i.e., containing multi-valued data. We propose in the rst part of this paper a possibilistic networks sampling process. In the second part, we propose a likelihood function which explores the link between random sets theory and possibility theory. This function is then deployed to parametrize possibilistic networks.
Tasks
Published	2016-07-13
URL	http://arxiv.org/abs/1607.03705v1
PDF	http://arxiv.org/pdf/1607.03705v1.pdf
PWC	https://paperswithcode.com/paper/possibilistic-networks-parameters-learning
Repo
Framework


Title	Using Gaussian Processes for Rumour Stance Classification in Social Media
Authors	Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Arkaitz Zubiaga, Maria Liakata, Rob Procter
Abstract	Social media tend to be rife with rumours while new reports are released piecemeal during breaking news. Interestingly, one can mine multiple reactions expressed by social media users in those situations, exploring their stance towards rumours, ultimately enabling the flagging of highly disputed rumours as being potentially false. In this work, we set out to develop an automated, supervised classifier that uses multi-task learning to classify the stance expressed in each individual tweet in a rumourous conversation as either supporting, denying or questioning the rumour. Using a classifier based on Gaussian Processes, and exploring its effectiveness on two datasets with very different characteristics and varying distributions of stances, we show that our approach consistently outperforms competitive baseline classifiers. Our classifier is especially effective in estimating the distribution of different types of stance associated with a given rumour, which we set forth as a desired characteristic for a rumour-tracking system that will warn both ordinary users of Twitter and professional news practitioners when a rumour is being rebutted.
Tasks	Gaussian Processes, Multi-Task Learning, Rumour Detection
Published	2016-09-07
URL	http://arxiv.org/abs/1609.01962v1
PDF	http://arxiv.org/pdf/1609.01962v1.pdf
PWC	https://paperswithcode.com/paper/using-gaussian-processes-for-rumour-stance
Repo
Framework

Nonparametric General Reinforcement Learning


Title	Nonparametric General Reinforcement Learning
Authors	Jan Leike
Abstract	Reinforcement learning (RL) problems are often phrased in terms of Markov decision processes (MDPs). In this thesis we go beyond MDPs and consider RL in environments that are non-Markovian, non-ergodic and only partially observable. Our focus is not on practical algorithms, but rather on the fundamental underlying problems: How do we balance exploration and exploitation? How do we explore optimally? When is an agent optimal? We follow the nonparametric realizable paradigm. We establish negative results on Bayesian RL agents, in particular AIXI. We show that unlucky or adversarial choices of the prior cause the agent to misbehave drastically. Therefore Legg-Hutter intelligence and balanced Pareto optimality, which depend crucially on the choice of the prior, are entirely subjective. Moreover, in the class of all computable environments every policy is Pareto optimal. This undermines all existing optimality properties for AIXI. However, there are Bayesian approaches to general RL that satisfy objective optimality guarantees: We prove that Thompson sampling is asymptotically optimal in stochastic environments in the sense that its value converges to the value of the optimal policy. We connect asymptotic optimality to regret given a recoverability assumption on the environment that allows the agent to recover from mistakes. Hence Thompson sampling achieves sublinear regret in these environments. Our results culminate in a formal solution to the grain of truth problem: A Bayesian agent acting in a multi-agent environment learns to predict the other agents’ policies if its prior assigns positive probability to them (the prior contains a grain of truth). We construct a large but limit computable class containing a grain of truth and show that agents based on Thompson sampling over this class converge to play Nash equilibria in arbitrary unknown computable multi-agent environments.
Tasks
Published	2016-11-28
URL	http://arxiv.org/abs/1611.08944v1
PDF	http://arxiv.org/pdf/1611.08944v1.pdf
PWC	https://paperswithcode.com/paper/nonparametric-general-reinforcement-learning
Repo
Framework

Fast $(1+ε)$-approximation of the Löwner extremal matrices of high-dimensional symmetric matrices


Title	Fast $(1+ε)$-approximation of the Löwner extremal matrices of high-dimensional symmetric matrices
Authors	Frank Nielsen, Richard Nock
Abstract	Matrix data sets are common nowadays like in biomedical imaging where the Diffusion Tensor Magnetic Resonance Imaging (DT-MRI) modality produces data sets of 3D symmetric positive definite matrices anchored at voxel positions capturing the anisotropic diffusion properties of water molecules in biological tissues. The space of symmetric matrices can be partially ordered using the L"owner ordering, and computing extremal matrices dominating a given set of matrices is a basic primitive used in matrix-valued signal processing. In this letter, we design a fast and easy-to-implement iterative algorithm to approximate arbitrarily finely these extremal matrices. Finally, we discuss on extensions to matrix clustering.
Tasks
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01592v1
PDF	http://arxiv.org/pdf/1604.01592v1.pdf
PWC	https://paperswithcode.com/paper/fast-1-approximation-of-the-lowner-extremal
Repo
Framework

Scenario Submodular Cover


Title	Scenario Submodular Cover
Authors	Nathaniel Grammel, Lisa Hellerstein, Devorah Kletenik, Patrick Lin
Abstract	Many problems in Machine Learning can be modeled as submodular optimization problems. Recent work has focused on stochastic or adaptive versions of these problems. We consider the Scenario Submodular Cover problem, which is a counterpart to the Stochastic Submodular Cover problem studied by Golovin and Krause. In Scenario Submodular Cover, the goal is to produce a cover with minimum expected cost, where the expectation is with respect to an empirical joint distribution, given as input by a weighted sample of realizations. In contrast, in Stochastic Submodular Cover, the variables of the input distribution are assumed to be independent, and the distribution of each variable is given as input. Building on algorithms developed by Cicalese et al. and Golovin and Krause for related problems, we give two approximation algorithms for Scenario Submodular Cover over discrete distributions. The first achieves an approximation factor of O(log Qm), where m is the size of the sample and Q is the goal utility. The second, simpler algorithm achieves an approximation bound of O(log QW), where Q is the goal utility and W is the sum of the integer weights. (Both bounds assume an integer-valued utility function.) Our results yield approximation bounds for other problems involving non-independent distributions that are explicitly specified by their support.
Tasks
Published	2016-03-10
URL	http://arxiv.org/abs/1603.03158v1
PDF	http://arxiv.org/pdf/1603.03158v1.pdf
PWC	https://paperswithcode.com/paper/scenario-submodular-cover
Repo
Framework