Decomposing Temporal Equilibrium Strategy for Coordinated Distributed Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Zhu, Chenyang [1 ]
Si, Wen [1 ]
Zhu, Jinyu [1 ]
Jiang, Zhihao [2 ]
Affiliations
[1] Changzhou Univ, Sch Comp Sci & Artificial Intelligence, Changzhou, Jiangsu, Peoples R China
[2] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The increasing demands for system complexity and robustness have prompted the integration of temporal logic into Multi-Agent Reinforcement Learning (MARL) to address tasks with non-Markovian properties. However, incorporating non-Markovian properties introduces additional computational complexities, as agents are required to integrate historical data into their decision-making process. Also, optimizing strategies within a multi-agent environment presents significant challenges due to the exponential growth of the state space with the number of agents. In this study, we introduce an innovative hierarchical MARL framework that synthesizes temporal equilibrium strategies through parity games and subsequently encodes them as individual reward machines for MARL coordination. More specifically, we reduce the strategy synthesis problem into an emptiness problem concerning parity games with optimized states and transitions. Following this synthesis step, the temporal equilibrium strategy is decomposed into individual reward machines for decentralized MARL. Theoretical proofs are provided to verify the consistency of the Nash equilibrium between the parallel composition of decomposed strategies and the original strategy. Empirical evidence confirms the efficacy of the proposed synthesis technique, showcasing its ability to reduce state space compared to the state-of-the-art tool. Furthermore, our study highlights the superior performance of the distributed MARL paradigm over centralized approaches when deploying decomposed strategies.
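The abstract refers to reward machines as the per-agent encoding of a synthesized strategy. As a rough illustration of that general concept only (not the paper's synthesis or decomposition procedure), the sketch below models a reward machine as a finite-state machine that consumes event labels and emits rewards, which lets a history-dependent task be tracked with a small memory state. The class name `RewardMachine`, the helper `run`, the states `u0`/`u1`/`u2`, and the events `a`/`b` are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Illustrative reward machine: a finite-state machine over event labels."""
    initial: str
    # Maps (state, event) -> (next_state, reward). Hypothetical layout.
    transitions: dict
    terminal: frozenset = frozenset()
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial

    def reset(self):
        self.state = self.initial

    def step(self, event):
        """Advance on one event; undefined (state, event) pairs
        leave the state unchanged and yield zero reward."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def done(self):
        return self.state in self.terminal


def run(rm, events):
    """Reset the machine, feed it a sequence of events, return total reward."""
    rm.reset()
    total = 0.0
    for event in events:
        if rm.done():
            break
        total += rm.step(event)
    return total


# Hypothetical non-Markovian task: observe "a" first, then "b".
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "a"): ("u1", 0.0),
        ("u1", "b"): ("u2", 1.0),
    },
    terminal=frozenset({"u2"}),
)
```

Under these assumptions, `run(rm, ["a", "b"])` yields a reward only because `a` was seen before `b`; the same events in the opposite order earn nothing. The machine state is what carries the history that a purely Markovian reward could not express.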
Pages: 17618-17627
Page count: 10