Decomposing Temporal Equilibrium Strategy for Coordinated Distributed Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Zhu, Chenyang [1]
Si, Wen [1]
Zhu, Jinyu [1]
Jiang, Zhihao [2]
Affiliations
[1] Changzhou Univ, Sch Comp Sci & Artificial Intelligence, Changzhou, Jiangsu, Peoples R China
[2] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
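Funding
National Natural Science Foundation of China
Keywords
MARKOV DECISION-PROCESSES
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract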
The increasing demands for system complexity and robustness have prompted the integration of temporal logic into Multi-Agent Reinforcement Learning (MARL) to address tasks with non-Markovian properties. However, incorporating non-Markovian properties introduces additional computational complexity, as agents are required to integrate historical data into their decision-making process. In addition, optimizing strategies within a multi-agent environment presents significant challenges because the state space grows exponentially with the number of agents. In this study, we introduce a hierarchical MARL framework that synthesizes temporal equilibrium strategies through parity games and subsequently encodes them as individual reward machines for MARL coordination. More specifically, we reduce the strategy synthesis problem to an emptiness problem concerning parity games with optimized states and transitions. Following this synthesis step, the temporal equilibrium strategy is decomposed into individual reward machines for decentralized MARL. Theoretical proofs are provided to verify the consistency of the Nash equilibrium between the parallel composition of decomposed strategies and the original strategy. Empirical evidence confirms the efficacy of the proposed synthesis technique, showcasing its ability to reduce the state space compared with the state-of-the-art tool. Furthermore, our study highlights the superior performance of the distributed MARL paradigm over centralized approaches when deploying decomposed strategies.
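As a rough illustration of the "individual reward machine" idea mentioned in the abstract, the minimal sketch below models a reward machine as a finite-state automaton whose transitions are triggered by high-level events and emit rewards. It is not the paper's implementation: the class name `RewardMachine`, the state labels `u0`, `u1`, `u2`, the event names, and the two-step task are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine mapping (state, event) -> (next_state, reward)."""
    initial: str
    terminal: set
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Events with no matching transition leave the state unchanged
        # and yield zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        # Replay a sequence of events and accumulate the emitted rewards.
        state, total = self.initial, 0.0
        for event in events:
            if state in self.terminal:
                break
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical non-Markovian task: "button" must be observed before "goal".
rm = RewardMachine(
    initial="u0",
    terminal={"u2"},
    transitions={
        ("u0", "button"): ("u1", 0.0),
        ("u1", "goal"): ("u2", 1.0),
    },
)

final_state, total_reward = rm.run(["goal", "button", "goal"])
```

The machine state records which part of the task history has already been satisfied, so the first "goal" event earns nothing while the second one, after "button", does. This is how a reward machine lets an otherwise memoryless learner handle a history-dependent objective.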
Pages: 17618-17627
Page count: 10
Related papers
50 in total
  • [21] Coordinated Ramp Metering Control Based on Multi-Agent Reinforcement Learning
    Tan, Jiyuan
    Qiu, Qianqian
    Guo, Weiwei
    2020 35TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2020: 492-498
  • [22] Multi-Agent Deep Reinforcement Learning for Coordinated Multipoint in Mobile Networks
    Schneider, Stefan
    Karl, Holger
    Khalili, Ramin
    Hecker, Artur
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2024, 21(01): 908-924
  • [23] Feudal Latent Space Exploration for Coordinated Multi-Agent Reinforcement Learning
    Liu, Xiangyu
    Tan, Ying
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34(10): 7775-7783
  • [24] Analysis of coordinated behavior structures with multi-agent deep reinforcement learning
    Miyashita, Yuki
    Sugawara, Toshiharu
    APPLIED INTELLIGENCE, 2021, 51(02): 1069-1085
  • [25] Scalable Multi-Agent Reinforcement Learning for Dynamic Coordinated Multipoint Clustering
    Hu, Fenghe
    Deng, Yansha
    Hamid Aghvami, A.
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2023, 71(01): 101-114
  • [26] Analysis of coordinated behavior structures with multi-agent deep reinforcement learning
    Yuki Miyashita
    Toshiharu Sugawara
    Applied Intelligence, 2021, 51: 1069-1085
  • [27] Multi-Agent Deep Reinforcement Learning for Distributed Load Restoration
    Linh Vu
    Tuyen Vu
    Thanh Long Vu
    Srivastava, Anurag
    IEEE TRANSACTIONS ON SMART GRID, 2024, 15(02): 1749-1760
  • [28] Distributed Inverse Constrained Reinforcement Learning for Multi-agent Systems
    Liu, Shicheng
    Zhu, Minghui
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [29] Towards a Distributed Framework for Multi-Agent Reinforcement Learning Research
    Zhou, Yutai
    Manuel, Shawn
    Morales, Peter
    Li, Sheng
    Pena, Jaime
    Allen, Ross
    2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020
  • [30] Multi-Agent Deep Reinforcement Learning for Distributed Satellite Routing
    Lozano-Cuadra, Federico
    Soret, Beatriz
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024: 554-555