Multi-agent reinforcement learning as a rehearsal for decentralized planning

Cited by: 183
Authors
Kraemer, Landon [1]
Banerjee, Bikramjit [1]
Affiliations
[1] Univ So Mississippi, Sch Comp, Hattiesburg, MS 39406 USA
Funding
National Science Foundation (USA)
Keywords
Multi-agent reinforcement learning; Decentralized planning
DOI
10.1016/j.neucom.2016.01.031
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Decentralized partially observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Multi-agent reinforcement learning (MARL) based approaches have recently been proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. In some practical scenarios this may not be the case. We propose a novel MARL approach in which agents are allowed to rehearse with information that will not be available during policy execution. The key is for the agents to learn policies that do not explicitly rely on these rehearsal features. We also establish a weak convergence result for our algorithm, RLaR, demonstrating that RLaR converges in probability when certain conditions are met. We show experimentally that incorporating rehearsal features can enhance the learning rate compared to non-rehearsal based learners, and demonstrate fast, (near) optimal performance on many existing benchmark Dec-POMDP problems. We also compare RLaR against an existing approximate Dec-POMDP solver which, like RLaR, does not assume a priori knowledge of the model. While RLaR's policy representation is not as scalable, we show that RLaR produces higher quality policies for most problems and horizons studied. © 2016 Elsevier B.V. All rights reserved.
Pages: 82-94
Number of pages: 13
Related papers
50 in total
  • [21] Decentralized Multi-agent Formation Control via Deep Reinforcement Learning
    Gupta, Aniket
    Nallanthighal, Raghava
    ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2021, : 289 - 295
  • [22] Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning
    Qu, Chao
    Mannor, Shie
    Xu, Huan
    Qi, Yuan
    Song, Le
    Xiong, Junwu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [23] Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication
    Lidard, Justin
    Madhushani, Udari
    Leonard, Naomi Ehrich
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 3311 - 3316
  • [24] Automata Guided Semi-Decentralized Multi-Agent Reinforcement Learning
    Sun, Chuangchuang
    Li, Xiao
    Belta, Calin
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 3900 - 3905
  • [25] Network Maintenance Planning Via Multi-Agent Reinforcement Learning
    Thomas, Jonathan
    Hernandez, Marco Perez
    Parlikad, Ajith Kumar
    Piechocki, Robert
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 2289 - 2295
  • [26] Multi-agent reinforcement learning for planning and scheduling multiple goals
    Arai, S
    Sycara, K
    Payne, TR
    FOURTH INTERNATIONAL CONFERENCE ON MULTIAGENT SYSTEMS, PROCEEDINGS, 2000, : 359 - 360
  • [27] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43
  • [28] Decentralized Multi-agent Reinforcement Learning with Multi-time Scale of Decision Epochs
    Wu, Junjie
    Li, Kuo
    Jia, Qing-Shan
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 578 - 584
  • [29] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Lau, Hoong Chuin
    Zilberstein, Shlomo
    Zhang, Chongjie
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1447 - 1455
  • [30] Decentralized Exploration of a Structured Environment Based on Multi-agent Deep Reinforcement Learning
    He, Dingjie
    Feng, Dawei
    Jia, Hongda
    Liu, Hui
    2020 IEEE 26TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2020, : 172 - 179