October 19, 2019


Paper Group ANR 251


Distributed Wildfire Surveillance with Autonomous Aircraft using Deep Reinforcement Learning


Title	Distributed Wildfire Surveillance with Autonomous Aircraft using Deep Reinforcement Learning
Authors	Kyle D. Julian, Mykel J. Kochenderfer
Abstract	Teams of autonomous unmanned aircraft can be used to monitor wildfires, enabling firefighters to make informed decisions. However, controlling multiple autonomous fixed-wing aircraft to maximize forest fire coverage is a complex problem. The state space is high dimensional, the fire propagates stochastically, the sensor information is imperfect, and the aircraft must coordinate with each other to accomplish their mission. This work presents two deep reinforcement learning approaches for training decentralized controllers that accommodate the high dimensionality and uncertainty inherent in the problem. The first approach controls the aircraft using immediate observations of the individual aircraft. The second approach allows aircraft to collaborate on a map of the wildfire’s state and maintain a time history of locations visited, which are used as inputs to the controller. Simulation results show that both approaches allow the aircraft to accurately track wildfire expansions and outperform an online receding horizon controller. Additional simulations demonstrate that the approach scales with different numbers of aircraft and generalizes to different wildfire shapes.
Tasks
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04244v1
PDF	http://arxiv.org/pdf/1810.04244v1.pdf
PWC	https://paperswithcode.com/paper/distributed-wildfire-surveillance-with
Repo
Framework

A First Experiment on Including Text Literals in KGloVe


Title	A First Experiment on Including Text Literals in KGloVe
Authors	Michael Cochez, Martina Garofalo, Jérôme Lenßen, Maria Angela Pellegrino
Abstract	Graph embedding models produce embedding vectors for entities and relations in Knowledge Graphs, often without taking literal properties into account. We show an initial idea based on the combination of global graph structure with additional information provided by textual information in properties. Our initial experiment shows that this approach might be useful, but does not clearly outperform earlier approaches when evaluated on machine learning tasks.
Tasks	Graph Embedding, Knowledge Graphs
Published	2018-07-31
URL	http://arxiv.org/abs/1807.11761v1
PDF	http://arxiv.org/pdf/1807.11761v1.pdf
PWC	https://paperswithcode.com/paper/a-first-experiment-on-including-text-literals
Repo
Framework

Playing 20 Question Game with Policy-Based Reinforcement Learning


Title	Playing 20 Question Game with Policy-Based Reinforcement Learning
Authors	Huang Hu, Xianchao Wu, Bingfeng Luo, Chongyang Tao, Can Xu, Wei Wu, Zhan Chen
Abstract	The 20 Questions (Q20) game is a well-known game which encourages deductive reasoning and creativity. In the game, the answerer first thinks of an object such as a famous person or a kind of animal. Then the questioner tries to guess the object by asking 20 questions. In a Q20 game system, the user is considered as the answerer while the system itself acts as the questioner, which requires a good strategy of question selection to figure out the correct object and win the game. However, the optimal policy of question selection is hard to derive due to the complexity and volatility of the game environment. In this paper, we propose a novel policy-based Reinforcement Learning (RL) method, which enables the questioner agent to learn the optimal policy of question selection through continuous interactions with users. To facilitate training, we also propose to use a reward network to estimate the more informative reward. Compared to previous methods, our RL method is robust to noisy answers and does not rely on the Knowledge Base of objects. Experimental results show that our RL method clearly outperforms an entropy-based engineering system and has competitive performance in a noise-free simulation environment.
Tasks
Published	2018-08-23
URL	https://arxiv.org/abs/1808.07645v3
PDF	https://arxiv.org/pdf/1808.07645v3.pdf
PWC	https://paperswithcode.com/paper/playing-20-question-game-with-policy-based
Repo
Framework
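
The abstract above compares against an entropy-based question-selection system without describing it. As a rough illustration of what such a baseline can look like (not the authors' system and not their RL method), the sketch below picks the question whose yes/no split over the remaining candidate objects has the highest entropy. The names `best_question`, `update`, and the `yes_table` knowledge base are hypothetical, and answers are assumed to be noise-free.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a yes/no outcome with P(yes) = p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def best_question(prior, yes_table):
    """Return the index of the question with the most informative split.

    prior:     (n_objects,) probabilities over candidate objects.
    yes_table: (n_objects, n_questions) boolean array; True means the
               answer for that object and question is "yes" (assumed KB).
    """
    p_yes = prior @ yes_table.astype(float)  # P(answer is yes) per question
    return int(np.argmax(binary_entropy(p_yes)))

def update(prior, yes_table, question, answer_yes):
    """Condition the belief on a noise-free answer and renormalize."""
    consistent = yes_table[:, question] == answer_yes
    posterior = prior * consistent
    return posterior / posterior.sum()

# Toy knowledge base: 4 objects, 3 questions.
yes_table = np.array([
    [True,  True,  True],
    [False, True,  True],
    [False, False, True],
    [False, False, True],
])
prior = np.full(4, 0.25)
q = best_question(prior, yes_table)        # question 1 splits the objects 2/2
posterior = update(prior, yes_table, q, answer_yes=True)
```

A baseline of this kind depends on a table of object properties and on clean answers, which are the two dependencies the abstract says the RL method avoids.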

Analysis of Fast Alternating Minimization for Structured Dictionary Learning


Title	Analysis of Fast Alternating Minimization for Structured Dictionary Learning
Authors	Saiprasad Ravishankar, Anna Ma, Deanna Needell
Abstract	Methods exploiting sparsity have been popular in imaging and signal processing applications including compression, denoising, and imaging inverse problems. Data-driven approaches such as dictionary learning and transform learning enable one to discover complex image features from datasets and provide promising performance over analytical models. Alternating minimization algorithms have been particularly popular in dictionary or transform learning. In this work, we study the properties of alternating minimization for structured (unitary) sparsifying operator learning. While the algorithm converges to the stationary points of the non-convex problem in general, we prove rapid local linear convergence to the underlying generative model under mild assumptions. Our experiments show that the unitary operator learning algorithm is robust to initialization.
Tasks	Denoising, Dictionary Learning
Published	2018-02-01
URL	http://arxiv.org/abs/1802.00518v1
PDF	http://arxiv.org/pdf/1802.00518v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-fast-alternating-minimization-for
Repo
Framework
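
The alternating minimization studied in the abstract above can be sketched for the unitary (here real orthogonal) case under a common formulation, assumed here rather than taken from the paper: minimize ||Y - DX||_F^2 subject to D^T D = I and at most `s` nonzeros per column of X. With D orthogonal, the sparse coding step reduces to hard thresholding of D^T Y, and the operator update is an orthogonal Procrustes problem solved by an SVD. Function names and the random initialization are illustrative choices.

```python
import numpy as np

def hard_threshold(Z, s):
    """Keep the s largest-magnitude entries in each column of Z, zero the rest."""
    X = np.zeros_like(Z)
    idx = np.argsort(-np.abs(Z), axis=0)[:s, :]   # row indices, shape (s, N)
    cols = np.arange(Z.shape[1])
    X[idx, cols] = Z[idx, cols]
    return X

def unitary_altmin(Y, s, n_iter=20, seed=0):
    """Alternate between sparse coding and an orthogonal operator update."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    # Random orthogonal initialization (an assumption of this sketch).
    D, _ = np.linalg.qr(rng.standard_normal((n, n)))
    history = []
    for _ in range(n_iter):
        # Sparse coding: exact minimizer over X for fixed orthogonal D.
        X = hard_threshold(D.T @ Y, s)
        # Operator update: orthogonal Procrustes solution for fixed X.
        U, _, Vt = np.linalg.svd(Y @ X.T)
        D = U @ Vt
        # Objective value after the operator update.
        history.append(np.linalg.norm(Y - D @ X) ** 2)
    return D, X, history

# Synthetic data from a random orthogonal generative model.
rng = np.random.default_rng(1)
D0, _ = np.linalg.qr(rng.standard_normal((6, 6)))
X0 = hard_threshold(rng.standard_normal((6, 40)), 2)
Y = D0 @ X0
D, X, history = unitary_altmin(Y, s=2, n_iter=15)
```

Because each step exactly minimizes the objective over one block of variables, the recorded objective values never increase from one iteration to the next. The sketch makes no claim about how close `D` ends up to `D0`.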

A survey on Deep Learning Advances on Different 3D Data Representations


Title	A survey on Deep Learning Advances on Different 3D Data Representations
Authors	Eman Ahmed, Alexandre Saint, Abd El Rahman Shabayek, Kseniya Cherenkova, Rig Das, Gleb Gusev, Djamila Aouada, Bjorn Ottersten
Abstract	3D data is a valuable asset in the computer vision field as it provides rich information about the full geometry of sensed objects and scenes. With the recent availability of both large 3D datasets and computational power, it is now possible to consider applying deep learning to learn specific tasks on 3D data such as segmentation, recognition and correspondence. Depending on the considered 3D data representation, different challenges may be foreseen in using existing deep learning architectures. In this work, we provide a comprehensive overview of various 3D data representations, highlighting the difference between Euclidean and non-Euclidean ones. We also discuss how Deep Learning methods are applied on each representation, analyzing the challenges to overcome.
Tasks
Published	2018-08-04
URL	http://arxiv.org/abs/1808.01462v2
PDF	http://arxiv.org/pdf/1808.01462v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-advances-on-different-3d-data
Repo
Framework

Content Based Image Retrieval from AWiFS Images Repository of IRS Resourcesat-2 Satellite Based on Water Bodies and Burnt Areas


Title	Content Based Image Retrieval from AWiFS Images Repository of IRS Resourcesat-2 Satellite Based on Water Bodies and Burnt Areas
Authors	Suraj Kothawade, Kunjan Mhaske, Sahil Sharma, Furkhan Shaikh
Abstract	Satellite Remote Sensing Technology is becoming a major milestone in the prediction of weather anomalies, natural disasters as well as finding alternative resources in proximity using multiple multi-spectral sensors emitting electromagnetic waves at distinct wavelengths. Hence, it is imperative to extract water bodies and burnt areas from orthorectified tiles and correspondingly rank them using similarity measures. Different objects in all the spheres of the earth have the inherent capability of absorbing electromagnetic waves of distant wavelengths. This creates various unique masks in terms of reflectance on the receptor. We propose Dynamic Semantic Segmentation (DSS) algorithms that utilize the mentioned capability to extract and rank Advanced Wide Field Sensor (AWiFS) images according to various features. This system stores data intelligently in the form of a sparse feature vector which drastically mitigates the computational and spatial costs incurred for further analysis. The compressed source image is divided into chunks and stored in the database for quicker retrieval. This work is intended to utilize readily available and cost-effective resources like the AWiFS dataset instead of depending on advanced technologies like Moderate Resolution Imaging Spectroradiometer (MODIS) for data which is scarce.
Tasks	Content-Based Image Retrieval, Image Retrieval, Semantic Segmentation
Published	2018-09-26
URL	http://arxiv.org/abs/1809.10190v1
PDF	http://arxiv.org/pdf/1809.10190v1.pdf
PWC	https://paperswithcode.com/paper/content-based-image-retrieval-from-awifs
Repo
Framework

Multispectral Image Intrinsic Decomposition via Low Rank Constraint


Title	Multispectral Image Intrinsic Decomposition via Low Rank Constraint
Authors	Qian Huang, Weixin Zhu, Yang Zhao, Linsen Chen, Yao Wang, Tao Yue, Xun Cao
Abstract	Multispectral images contain many clues of surface characteristics of the objects, and thus can be widely used in many computer vision tasks, e.g., recolorization and segmentation. However, due to the complex illumination and the geometric structure of natural scenes, the spectral curves of the same surface can look very different. In this paper, a Low Rank Multispectral Image Intrinsic Decomposition model (LRIID) is presented to decompose the shading and reflectance from a single multispectral image. We extend the Retinex model, which was proposed for RGB image intrinsic decomposition, to the multispectral domain. Based on this, a low rank constraint is proposed to reduce the ill-posedness of the problem and make the algorithm solvable. A dataset of 12 images is given with the ground truth of shadings and reflectance, so that objective evaluations can be conducted. The experiments demonstrate the effectiveness of the proposed method.
Tasks
Published	2018-02-24
URL	http://arxiv.org/abs/1802.08793v1
PDF	http://arxiv.org/pdf/1802.08793v1.pdf
PWC	https://paperswithcode.com/paper/multispectral-image-intrinsic-decomposition
Repo
Framework

Towards a Simple Approach to Multi-step Model-based Reinforcement Learning


Title	Towards a Simple Approach to Multi-step Model-based Reinforcement Learning
Authors	Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman
Abstract	When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes. Model-based agents typically learn a single-step transition model. In this paper, we propose a multi-step model that predicts the outcome of an action sequence with variable length. We show that this model is easy to learn, and that the model can make policy-conditional predictions. We report preliminary results that show a clear advantage for the multi-step model compared to its one-step counterpart.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00128v1
PDF	http://arxiv.org/pdf/1811.00128v1.pdf
PWC	https://paperswithcode.com/paper/towards-a-simple-approach-to-multi-step-model
Repo
Framework

A Framework for Probabilistic Generic Traffic Scene Prediction


Title	A Framework for Probabilistic Generic Traffic Scene Prediction
Authors	Yeping Hu, Wei Zhan, Masayoshi Tomizuka
Abstract	In a given scenario, simultaneously and accurately predicting every possible interaction of traffic participants is an important capability for autonomous vehicles. The majority of current research has focused on the prediction of a single entity without incorporating the environment information. Although some approaches aimed to predict multiple vehicles, they either predicted each vehicle independently with no consideration of possible interaction with surrounding entities or generated discretized joint motions which cannot be directly used in decision making and motion planning for autonomous vehicles. In this paper, we present a probabilistic framework that is able to jointly predict continuous motions for multiple interacting road participants under any driving scenario and is capable of forecasting the duration of each interaction, which can enhance the prediction performance and efficiency. The proposed traffic scene prediction framework contains two hierarchical modules: the upper module and the lower module. The upper module forecasts the intention of the predicted vehicle, while the lower module predicts motions for interacting scene entities. An exemplar real-world scenario is used to implement and examine the proposed framework.
Tasks	Autonomous Vehicles, Decision Making, Motion Planning
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12506v1
PDF	http://arxiv.org/pdf/1810.12506v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-probabilistic-generic-traffic
Repo
Framework

Reinforcement Learning under Threats


Title	Reinforcement Learning under Threats
Authors	Victor Gallego, Roi Naveiro, David Rios Insua
Abstract	In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), which provide a framework to support a decision maker against a potential adversary in RL. Furthermore, we propose a level-$k$ thinking scheme resulting in a new learning framework to deal with TMDPs. After introducing our framework and deriving theoretical results, relevant empirical evidence is given via extensive experiments, showing the benefits of accounting for adversaries while the agent learns.
Tasks
Published	2018-09-05
URL	https://arxiv.org/abs/1809.01560v2
PDF	https://arxiv.org/pdf/1809.01560v2.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-under-threats
Repo
Framework

Average Convergence Rate of Evolutionary Algorithms II: Continuous Optimization


Title	Average Convergence Rate of Evolutionary Algorithms II: Continuous Optimization
Authors	Yu Chen, Jun He
Abstract	A good convergence metric must satisfy two requirements: feasible in calculation and rigorous in analysis. The average convergence rate is proposed as a new measurement for evaluating the convergence speed of evolutionary algorithms over consecutive generations. Its calculation is simple in practice and it is applicable to both continuous and discrete optimization. Previously a theoretical study of the average convergence rate was conducted for discrete optimization. This paper makes a further analysis for continuous optimization. First, the strategies of generating new solutions are classified into two categories: landscape-invariant and landscape-adaptive. Then, it is proven that the average convergence rate of evolutionary algorithms using landscape-invariant generators converges to zero, while the rate of algorithms using positive-adaptive generators has a positive limit. Finally, two case studies, the minimization problems of the two-dimensional sphere function and Rastrigin function, are made for demonstrating the applicability of the theory.
Tasks
Published	2018-10-27
URL	http://arxiv.org/abs/1810.11672v1
PDF	http://arxiv.org/pdf/1810.11672v1.pdf
PWC	https://paperswithcode.com/paper/average-convergence-rate-of-evolutionary
Repo
Framework
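
To make the average convergence rate from the abstract above concrete, the sketch below assumes the definition R_t = 1 - (e_t / e_0)^(1/t), where e_t is the distance between the best fitness at generation t and the optimal value; this formula is recalled from the earlier work on the metric and is not stated in the abstract. The (1+1) strategy with a fixed mutation step size is an illustrative stand-in for a generator that does not adapt to the landscape, and all function names are hypothetical.

```python
import random

def sphere(x):
    """Sphere function; its minimum value 0 is attained at the origin."""
    return sum(v * v for v in x)

def average_convergence_rate(errors):
    """R_t = 1 - (e_t / e_0)^(1/t) for errors e_0, ..., e_t (assumed definition)."""
    t = len(errors) - 1
    if t < 1 or errors[0] <= 0:
        raise ValueError("need at least two errors and a positive initial error")
    return 1.0 - (errors[-1] / errors[0]) ** (1.0 / t)

def one_plus_one_es(f, x0, sigma, generations, seed=0):
    """(1+1) evolution strategy with a fixed Gaussian mutation step size."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    errors = [fx]                      # optimum is 0, so error equals fitness
    for _ in range(generations):
        y = [v + rng.gauss(0.0, sigma) for v in x]
        fy = f(y)
        if fy <= fx:                   # elitist selection: keep the better point
            x, fx = y, fy
        errors.append(fx)
    return errors

errors = one_plus_one_es(sphere, [1.0, 1.0], sigma=0.1, generations=200)
rate = average_convergence_rate(errors)
```

For a sequence whose error halves every generation, such as `[8, 4, 2, 1]`, the function returns 0.5, which matches the reading of the metric as a per-generation geometric reduction factor.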

Auto-conditioned Recurrent Mixture Density Networks for Learning Generalizable Robot Skills


Title	Auto-conditioned Recurrent Mixture Density Networks for Learning Generalizable Robot Skills
Authors	Hejia Zhang, Eric Heiden, Stefanos Nikolaidis, Joseph J. Lim, Gaurav S. Sukhatme
Abstract	Personal robots assisting humans must perform complex manipulation tasks that are typically difficult to specify in traditional motion planning pipelines, where multiple objectives must be met and the high-level context be taken into consideration. Learning from demonstration (LfD) provides a promising way to learn these kinds of complex manipulation skills even from non-technical users. However, it is challenging for existing LfD methods to efficiently learn skills that can generalize to task specifications that are not covered by demonstrations. In this paper, we introduce a state transition model (STM) that generates joint-space trajectories by imitating motions from expert behavior. Given a few demonstrations, we show in real robot experiments that the learned STM can quickly generalize to unseen tasks and synthesize motions having longer time horizons than the expert trajectories. Compared to conventional motion planners, our approach enables the robot to accomplish complex behaviors from high-level instructions without laborious hand-engineering of planning objectives, while being able to adapt to changing goals during the skill execution. In conjunction with a trajectory optimizer, our STM can construct a high-quality skeleton of a trajectory that can be further improved in smoothness and precision. In combination with a learned inverse dynamics model, we additionally present results where the STM is used as a high-level planner. A video of our experiments is available at https://youtu.be/85DX9Ojq-90
Tasks	Motion Planning
Published	2018-09-29
URL	http://arxiv.org/abs/1810.00146v3
PDF	http://arxiv.org/pdf/1810.00146v3.pdf
PWC	https://paperswithcode.com/paper/auto-conditioned-recurrent-mixture-density
Repo
Framework

Task-Agnostic Meta-Learning for Few-shot Learning


Title	Task-Agnostic Meta-Learning for Few-shot Learning
Authors	Muhammad Abdullah Jamal, Guo-Jun Qi, Mubarak Shah
Abstract	Meta-learning approaches have been proposed to tackle the few-shot learning problem. Typically, a meta-learner is trained on a variety of tasks in the hopes of being generalizable to new tasks. However, the generalizability on new tasks of a meta-learner could be fragile when it is over-trained on existing tasks during the meta-training phase. In other words, the initial model of a meta-learner could be too biased towards existing tasks to adapt to new tasks, especially when only very few examples are available to update the model. To avoid a biased meta-learner and improve its generalizability, we propose a novel paradigm of Task-Agnostic Meta-Learning (TAML) algorithms. Specifically, we present an entropy-based approach that meta-learns an unbiased initial model with the largest uncertainty over the output labels by preventing it from over-performing in classification tasks. Alternatively, a more general inequality-minimization TAML is presented for more ubiquitous scenarios by directly minimizing the inequality of initial losses beyond the classification tasks wherever a suitable loss can be defined. Experiments on benchmarked datasets demonstrate that the proposed approaches outperform compared meta-learning algorithms in both few-shot classification and reinforcement learning tasks.
Tasks	Few-Shot Learning, Meta-Learning
Published	2018-05-20
URL	http://arxiv.org/abs/1805.07722v1
PDF	http://arxiv.org/pdf/1805.07722v1.pdf
PWC	https://paperswithcode.com/paper/task-agnostic-meta-learning-for-few-shot
Repo
Framework
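
The "largest uncertainty over the output labels" in the abstract above can be read as the entropy of the initial model's predicted class distribution. The sketch below only computes that quantity for a batch of logits; it is not the TAML training procedure, and how the entropy enters the meta-objective is left out. The function names are illustrative.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prediction_entropy(logits):
    """Mean Shannon entropy (in nats) of the softmax predictions."""
    p = softmax(np.asarray(logits, dtype=float))
    # Small constant guards against log(0) for near-one-hot predictions.
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())

uniform = prediction_entropy(np.zeros(5))                  # maximally uncertain
peaked = prediction_entropy([10.0, 0.0, 0.0, 0.0, 0.0])    # confident prediction
```

With five classes the entropy is bounded above by log(5), reached when every label is equally likely, so an initial model that favors no class scores highest on this measure.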

Regularized Greedy Column Subset Selection


Title	Regularized Greedy Column Subset Selection
Authors	Bruno Ordozgoiti, Alberto Mozo, Jesús García López de Lacalle
Abstract	The Column Subset Selection Problem provides a natural framework for unsupervised feature selection. Despite being a hard combinatorial optimization problem, there exist efficient algorithms that provide good approximations. The drawback of the problem formulation is that it incorporates no form of regularization, and is therefore very sensitive to noise when presented with scarce data. In this paper we propose a regularized formulation of this problem, and derive a correct greedy algorithm that is similar in efficiency to existing greedy methods for the unregularized problem. We study its adequacy for feature selection and propose suitable formulations. Additionally, we derive a lower bound for the error of the proposed problems. Through various numerical experiments on real and synthetic data, we demonstrate the significantly increased robustness and stability of our method, as well as the improved conditioning of its output, all while remaining efficient for practical use.
Tasks	Combinatorial Optimization, Feature Selection
Published	2018-04-12
URL	http://arxiv.org/abs/1804.04421v1
PDF	http://arxiv.org/pdf/1804.04421v1.pdf
PWC	https://paperswithcode.com/paper/regularized-greedy-column-subset-selection
Repo
Framework

Automatic Event Salience Identification


Title	Automatic Event Salience Identification
Authors	Zhengzhong Liu, Chenyan Xiong, Teruko Mitamura, Eduard Hovy
Abstract	Identifying the salience (i.e. importance) of discourse units is an important task in language understanding. While events play important roles in text documents, little research exists on analyzing their saliency status. This paper empirically studies the Event Salience task and proposes two salience detection models based on content similarities and discourse relations. The first is a feature based salience model that incorporates similarities among discourse units. The second is a neural model that captures more complex relations between discourse units. Tested on our new large-scale event salience corpus, both methods significantly outperform the strong frequency baseline, while our neural model further improves the feature based one by a large margin. Our analyses demonstrate that our neural model captures interesting connections between salience and discourse unit relations (e.g., scripts and frame structures).
Tasks
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00647v1
PDF	http://arxiv.org/pdf/1809.00647v1.pdf
PWC	https://paperswithcode.com/paper/automatic-event-salience-identification
Repo
Framework