SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Citations: 0

Authors:

Wen, Chao [1]

Yao, Xinghu [1]

Wang, Yuhui [1]

Tan, Xiaoyang [1]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, MIIT Key Lab Pattern Anal & Machine Intelligence, Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 211106, Peoples R China

Source:

THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020 / Vol. 34

Funding:

National Science Foundation (United States);

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work presents a sample-efficient and effective value-based method, named SMIX(λ), for reinforcement learning in multi-agent environments (MARL) within the paradigm of centralized training with decentralized execution (CTDE), in which learning a stable and generalizable centralized value function (CVF) is crucial. To achieve this, our method carefully combines different elements, including 1) removing the unrealistic centralized greedy assumption during the learning phase, 2) using the λ-return to balance the trade-off between bias and variance and to deal with the environment's non-Markovian property, and 3) adopting experience-replay-style off-policy training. Interestingly, it is revealed that there exists an inherent connection between SMIX(λ) and the previous off-policy Q(λ) approach for single-agent learning. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark show that the proposed SMIX(λ) algorithm outperforms several state-of-the-art MARL methods by a large margin, and that it can be used as a general tool to improve the overall performance of a CTDE-type method by enhancing the evaluation quality of its CVF. We open-source our code at: https://github.com/chaovven/SMIX.
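
The λ-return mentioned in the abstract can be illustrated with a minimal sketch. This is a generic textbook recursion, not the authors' implementation (see their repository for that): the function name `lambda_returns`, its parameters, and the default values of `gamma` and `lam` are assumptions made for illustration, and `next_values` stands in for whatever bootstrap estimate a method uses (in a CTDE setting, presumably the centralized value of the next state).

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one episode by backward recursion.

    rewards[t]     : reward received after step t
    next_values[t] : bootstrap value estimate for the state reached
                     after step t (use 0.0 if that state is terminal)

    Recursion: G_t = r_t + gamma * ((1 - lam) * V_{t+1} + lam * G_{t+1}),
    with G at the end of the episode set to the final bootstrap value.
    lam = 0 gives one-step TD targets (low variance, more bias);
    lam = 1 gives Monte Carlo style returns (low bias, more variance).
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have equal length")
    returns = [0.0] * len(rewards)
    # Beyond the last step, the only available estimate is the bootstrap value.
    g_next = next_values[-1] if rewards else 0.0
    for t in reversed(range(len(rewards))):
        g_next = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g_next)
        returns[t] = g_next
    return returns
```

For example, calling `lambda_returns([1.0, 1.0, 1.0], [0.5, 0.5, 0.0], gamma=0.9, lam=0.0)` yields the one-step targets, while `lam=1.0` accumulates the discounted rewards of the whole episode; intermediate values of `lam` interpolate between the two.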


Pages: 7301 - 7308

Number of pages: 8

50 items in total

[1] SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multiagent Reinforcement Learning
Yao, Xinghu
Wen, Chao
Wang, Yuhui
Tan, Xiaoyang
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (01) : 52 - 63
[2] Centralized reinforcement learning for multi-agent cooperative environments
Chengxuan Lu
Qihao Bao
Shaojie Xia
Chongxiao Qu
[J]. Evolutionary Intelligence, 2024, 17 : 267 - 273
[3] Centralized reinforcement learning for multi-agent cooperative environments
Lu, Chengxuan
Bao, Qihao
Xia, Shaojie
Qu, Chongxiao
[J]. EVOLUTIONARY INTELLIGENCE, 2024, 17 (01) : 267 - 273
[4] Optimistic Value Instructors for Cooperative Multi-Agent Reinforcement Learning
Li, Chao
Zhang, Yupeng
Wang, Jianqi
Hu, Yujing
Dong, Shaokang
Li, Wenbin
Lv, Tangjie
Fan, Changjie
Gao, Yang
[J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17453 - 17460
[5] On Centralized Critics in Multi-Agent Reinforcement Learning
Lyu, Xueguang
Baisero, Andrea
Xiao, Yuchen
Daley, Brett
Amato, Christopher
[J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2023, 77 : 295 - 354
[6] On Centralized Critics in Multi-Agent Reinforcement Learning
Lyu, Xueguang
Baisero, Andrea
Xiao, Yuchen
Daley, Brett
Amato, Christopher
[J]. Journal of Artificial Intelligence Research, 2023, 77 : 295 - 354
[7] An Efficient Centralized Multi-Agent Reinforcement Learner for Cooperative Tasks
Liao, Dengyu
Zhang, Zhen
Song, Tingting
Liu, Mingyang
[J]. IEEE ACCESS, 2023, 11 : 139284 - 139294
[8] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
Chen, Hao
Yang, Guangkai
Zhang, Junge
Yin, Qiyue
Huang, Kaiqi
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[9] On the Robustness of Cooperative Multi-Agent Reinforcement Learning
Lin, Jieyu
Dzeparoska, Kristina
Zhang, Sai Qian
Leon-Garcia, Alberto
Papernot, Nicolas
[J]. 2020 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2020), 2020, : 62 - 68
[10] Consensus Learning for Cooperative Multi-Agent Reinforcement Learning
Xu, Zhiwei
Zhang, Bin
Li, Dapeng
Zhang, Zeren
Zhou, Guangchong
Chen, Hao
Fan, Guoliang
[J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 10, 2023, : 11726 - 11734
