SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Cited by: 0

Authors:

Wen, Chao [1]

Yao, Xinghu [1]

Wang, Yuhui [1]

Tan, Xiaoyang [1]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, MIIT Key Lab Pattern Anal & Machine Intelligence, Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 211106, Peoples R China

Source:

THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020 / Vol. 34

Funding:

National Science Foundation (United States);

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This work presents a sample-efficient and effective value-based method, named SMIX(λ), for reinforcement learning in multi-agent environments (MARL) within the paradigm of centralized training with decentralized execution (CTDE), in which learning a stable and generalizable centralized value function (CVF) is crucial. To achieve this, our method combines several elements: 1) removing the unrealistic centralized greedy assumption during the learning phase, 2) using the λ-return to balance the trade-off between bias and variance and to deal with the environment's non-Markovian property, and 3) adopting experience-replay-style off-policy training. Interestingly, an inherent connection is revealed between SMIX(λ) and the earlier off-policy Q(λ) approach for single-agent learning. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark show that the proposed SMIX(λ) algorithm outperforms several state-of-the-art MARL methods by a large margin, and that it can be used as a general tool to improve the overall performance of a CTDE-type method by enhancing the evaluation quality of its CVF. We open-source our code at: https://github.com/chaovven/SMIX.
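
The λ-return mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' implementation (see their repository for that); it only shows the standard recursive form of the λ-return, G_t = r_t + γ[(1 − λ)·v_{t+1} + λ·G_{t+1}], which interpolates between a one-step bootstrapped target (λ = 0) and a Monte Carlo-style return (λ = 1). The function name `lambda_returns` and its parameters are hypothetical, and `next_values` stands in for whatever bootstrap estimate a method uses (assumed here to be 0.0 after a terminal step).

```python
from typing import List


def lambda_returns(
    rewards: List[float],
    next_values: List[float],
    gamma: float,
    lam: float,
) -> List[float]:
    """Compute lambda-returns for one episode (generic sketch, hypothetical helper).

    rewards[t]     -- reward received after acting at step t
    next_values[t] -- bootstrap value estimate for step t + 1
                      (assumed to be 0.0 when step t + 1 is terminal)
    gamma          -- discount factor
    lam            -- lambda in [0, 1]; 0 gives one-step targets,
                      1 gives the full discounted return
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have the same length")

    returns = [0.0] * len(rewards)
    # Beyond the last step there is nothing to accumulate, so the
    # recursion starts from the final bootstrap value.
    next_return = next_values[-1] if rewards else 0.0

    # Work backwards: G_t = r_t + gamma * ((1 - lam) * v_{t+1} + lam * G_{t+1})
    for t in reversed(range(len(rewards))):
        returns[t] = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns


if __name__ == "__main__":
    # Toy episode with made-up numbers, for illustration only.
    print(lambda_returns([1.0, 1.0, 1.0], [0.5, 0.5, 0.0], gamma=1.0, lam=0.5))
```

Smaller values of λ lean on the bootstrap estimates (lower variance, more bias), while larger values lean on the observed rewards (less bias, more variance), which is the trade-off the abstract refers to.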

Pages: 7301-7308

Page count: 8
