Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning

Cited by: 19
Authors
Ma, Aaron [1]
Ouimet, Michael [2]
Cortes, Jorge [1]
Affiliations
[1] Univ Calif San Diego, Dept Mech & Aerosp Engn, La Jolla, CA 92093 USA
[2] Naval Informat Warfare Ctr Pacific, San Diego, CA USA
Keywords
Reinforcement learning; Multi-agent planning; Distributed robotics; Semi-Markov decision processes; Markov decision processes; Upper confidence bound tree search; Hierarchical planning; Hierarchical Markov decision processes; Model-based reinforcement learning; Swarm robotics; Dynamic domain reduction; Submodularity; POMDPs
DOI
10.1007/s10514-019-09871-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider scenarios where a swarm of unmanned vehicles (UxVs) seek to satisfy a number of diverse, spatially distributed objectives. The UxVs strive to determine an efficient plan to service the objectives while operating in a coordinated fashion. We focus on developing autonomous high-level planning, where low-level controls are leveraged from previous work in distributed motion, target tracking, localization, and communication. We rely on the use of state and action abstractions in a Markov decision process framework to introduce a hierarchical algorithm, Dynamic Domain Reduction for Multi-Agent Planning, that enables multi-agent planning for large multi-objective environments. Our analysis establishes the correctness of our search procedure within specific subsets of the environments, termed 'sub-environments', and characterizes the algorithm's performance with respect to the optimal trajectories in single-agent and sequential multi-agent deployment scenarios using tools from submodularity. Simulated results show significant improvement over using a standard Monte Carlo tree search in an environment with large state and action spaces.
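As a rough illustration of the sequential multi-agent deployment idea mentioned in the abstract, the sketch below lets agents choose plans one after another, each maximizing its marginal gain under a coverage objective (a standard example of a monotone submodular function). This is not the authors' Dynamic Domain Reduction algorithm; the names `coverage`, `sequential_greedy`, `uxv_1`, `uxv_2`, and the objective labels are hypothetical and chosen only for the example.

```python
from typing import Dict, FrozenSet, List, Set


def coverage(chosen: List[FrozenSet[str]]) -> int:
    """Count distinct objectives serviced (a monotone submodular set function)."""
    covered: Set[str] = set()
    for plan in chosen:
        covered |= plan
    return len(covered)


def sequential_greedy(
    options: Dict[str, List[FrozenSet[str]]]
) -> Dict[str, FrozenSet[str]]:
    """Each agent, in dictionary order, picks the plan with the largest
    marginal gain given the plans already committed by earlier agents."""
    chosen: List[FrozenSet[str]] = []
    assignment: Dict[str, FrozenSet[str]] = {}
    for agent, plans in options.items():
        base = coverage(chosen)
        best = max(plans, key=lambda p: coverage(chosen + [p]) - base)
        assignment[agent] = best
        chosen.append(best)
    return assignment


# Hypothetical candidate plans: each plan is the set of objectives it services.
options = {
    "uxv_1": [frozenset({"a", "b"}), frozenset({"c"})],
    "uxv_2": [frozenset({"a", "b"}), frozenset({"c", "d"})],
}
plan = sequential_greedy(options)
total = coverage(list(plan.values()))
```

Here the second agent avoids duplicating the first agent's objectives because the marginal gain of an already-covered plan is zero, which is the diminishing-returns property that submodularity formalizes.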
Pages: 485-503
Number of pages: 19
Related papers
50 in total
  • [1] Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
    Aaron Ma
    Michael Ouimet
    Jorge Cortés
    [J]. Autonomous Robots, 2020, 44 : 485 - 503
  • [2] Multi-Agent Hierarchical Reinforcement Learning with Dynamic Termination
    Han, Dongge
    Boehmer, Wendelin
    Wooldridge, Michael
    Rogers, Alex
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2006 - 2008
  • [3] Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination
    Han, Dongge
    Bohmer, Wendelin
    Wooldridge, Michael
    Rogers, Alex
    [J]. PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2019, 11671 : 80 - 92
  • [4] Hierarchical multi-agent reinforcement learning
    Mohammad Ghavamzadeh
    Sridhar Mahadevan
    Rajbala Makar
    [J]. Autonomous Agents and Multi-Agent Systems, 2006, 13 : 197 - 229
  • [5] Hierarchical multi-agent reinforcement learning
    Ghavamzadeh, Mohammad
    Mahadevan, Sridhar
    Makar, Rajbala
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2006, 13 (02) : 197 - 229
  • [6] Automating Feature Subspace Exploration via Multi-Agent Reinforcement Learning
    Liu, Kunpeng
    Fu, Yanjie
    Wang, Pengfei
    Wu, Le
    Bo, Rui
    Li, Xiaolin
    [J]. KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 207 - 215
  • [7] Multi-Agent Motion Planning for Dense and Dynamic Environments via Deep Reinforcement Learning
    Semnani, Samaneh Hosseini
    Liu, Hugh
    Everett, Michael
    de Ruiter, Anton
    How, Jonathan P.
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (02) : 3221 - 3226
  • [8] Network Maintenance Planning Via Multi-Agent Reinforcement Learning
    Thomas, Jonathan
    Hernandez, Marco Perez
    Parlikad, Ajith Kumar
    Piechocki, Robert
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 2289 - 2295
  • [9] Mitigating Bus Bunching via Hierarchical Multi-Agent Reinforcement Learning
    Yu, Mengdi
    Yang, Tao
    Li, Chunxiao
    Jin, Yaohui
    Xu, Yanyan
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (08) : 9675 - 9692
  • [10] AdverSAR: Adversarial Search and Rescue via Multi-Agent Reinforcement Learning
    Rahman, Aowabin
    Bhattacharya, Arnab
    Ramachandran, Thiagarajan
    Mukherjee, Sayak
    Sharma, Himanshu
    Fujimoto, Ted
    Chatterjee, Samrat
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON TECHNOLOGIES FOR HOMELAND SECURITY (HST), 2022