An Efficient Centralized Multi-Agent Reinforcement Learner for Cooperative Tasks

Cited by: 3
Authors
Liao, Dengyu [1]
Zhang, Zhen [1]
Song, Tingting [2]
Liu, Mingyang [1]
Affiliations
[1] Qingdao Univ, Sch Automat, Shandong Key Lab Ind Control Technol, Qingdao 266071, Peoples R China
[2] Qingdao Metro Grp Co Ltd, Operating Branch, Qingdao 266000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-agent systems; Reinforcement learning; Multi-agent reinforcement learning
DOI
10.1109/ACCESS.2023.3340867
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-agent reinforcement learning (MARL) for cooperative tasks has been extensively researched over the past decade. The prevalent framework for MARL algorithms is centralized training and decentralized execution, and Q-learning is often employed as the centralized learner. However, to update the Q-value of the last visited state-action pair (s, a), Q-learning must find the maximum Q-value in the next state s' by comparing the Q-values of every joint action a'. When the joint action space is large, this maximization is time-consuming and dominates the algorithm's computational cost. To tackle this issue, we propose an algorithm that reduces the number of comparisons by saving the joint actions with the top 2 Q-values (T2Q). Updating the top 2 Q-values involves seven cases, and the T2Q algorithm avoids traversing the Q-table in five of these seven cases, thus alleviating the computational burden. Theoretical analysis demonstrates that the upper bound of the expected ratio of comparisons between T2Q and Q-learning decreases as the number of agents increases. Simulation results from two-stage stochastic games are consistent with the theoretical analysis. Furthermore, the effectiveness of the T2Q algorithm is validated on the distributed sensor network task and the target transportation task. The T2Q algorithm completes both tasks with a 100% success rate and minimal computational overhead.
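
The idea of caching the two best joint actions per state can be illustrated with a minimal Python sketch. This is not the authors' T2Q implementation: the abstract does not list the seven update cases, so the case breakdown below is an assumption made for illustration, and the names `Top2QTable`, `set_q`, `max_q`, and `q_learning_step` are hypothetical. The sketch keeps, for each state, the indices of the joint actions with the largest and second-largest Q-values, so the maximum needed by the Q-learning target is read in constant time and a full scan of the row is needed only when a cached entry drops below the cached second-best value.

```python
import random


class Top2QTable:
    """Tabular Q-values with a cached top-2 per state (illustrative sketch)."""

    def __init__(self, n_actions):
        assert n_actions >= 2
        self.n_actions = n_actions  # size of the joint action space
        self.q = {}                 # state -> list of Q-values
        self.top = {}               # state -> [best action, second-best action]
        self.scans = 0              # number of full row traversals performed

    def _ensure(self, s):
        if s not in self.q:
            self.q[s] = [0.0] * self.n_actions
            self.top[s] = [0, 1]    # all values are equal at initialisation

    def max_q(self, s):
        """Return max over a' of Q(s, a') without scanning the row."""
        self._ensure(s)
        return self.q[s][self.top[s][0]]

    def _best_except(self, s, excluded):
        # Fallback: traverse the row to find the best action other than `excluded`.
        self.scans += 1
        row = self.q[s]
        candidates = (a for a in range(self.n_actions) if a != excluded)
        return max(candidates, key=row.__getitem__)

    def set_q(self, s, a, v):
        """Write Q(s, a) = v and repair the cached top-2 for state s."""
        self._ensure(s)
        row = self.q[s]
        a1, a2 = self.top[s]
        q1, q2 = row[a1], row[a2]   # cached values before the write
        row[a] = v
        if a == a1:
            if v < q2:              # best fell below second: scan for new second
                self.top[s] = [a2, self._best_except(s, a2)]
        elif a == a2:
            if v > q1:              # second overtook best: swap, no scan
                self.top[s] = [a2, a1]
            elif v < q2:            # second decreased: scan for new second
                self.top[s] = [a1, self._best_except(s, a1)]
        elif v > q1:                # other action becomes best, no scan
            self.top[s] = [a, a1]
        elif v > q2:                # other action becomes second, no scan
            self.top[s] = [a1, a]
        # otherwise the cached top-2 is unchanged


def q_learning_step(table, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One standard Q-learning update using the cached maximum for s_next."""
    table._ensure(s)
    target = r + gamma * table.max_q(s_next)
    old = table.q[s][a]
    table.set_q(s, a, old + alpha * (target - old))


if __name__ == "__main__":
    random.seed(0)
    table = Top2QTable(n_actions=8)  # e.g. 3 agents with 2 actions each
    updates = 1000
    for _ in range(updates):
        s, s_next = random.randrange(5), random.randrange(5)
        q_learning_step(table, s, random.randrange(8), random.uniform(-1, 1), s_next)
    print(f"{table.scans} full scans in {updates} updates")
```

In this sketch plain Q-learning would traverse the row of s' on every update, whereas here a traversal happens only in the two branches that call `_best_except`. The learning rate `alpha`, discount `gamma`, and the random environment in the demo are placeholder assumptions, not values from the paper.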
Pages: 139284-139294
Number of pages: 11