On Centralized Critics in Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Lyu, Xueguang [1 ]
Baisero, Andrea [1 ]
Xiao, Yuchen [1 ]
Daley, Brett [1 ]
Amato, Christopher [1 ]
Affiliations
[1] Northeastern University, Khoury College of Computer Sciences, 360 Huntington Avenue, Boston, MA 02115, United States
Funding
National Science Foundation (NSF), United States;
Keywords
Multi-agent systems;
DOI
Not available
Abstract
Centralized Training for Decentralized Execution, in which agents are trained offline in a centralized fashion and execute online in a decentralized manner, has become a popular approach in Multi-Agent Reinforcement Learning (MARL). In particular, it has become popular to develop actor-critic methods that train decentralized actors with a centralized critic, where the centralized critic is allowed access to global information about the entire system, including the true system state. Such centralized critics are possible given offline information and are not used for online execution. While these methods perform well in a number of domains and have become a de facto standard in MARL, using a centralized critic in this context has yet to be sufficiently analyzed theoretically or empirically. In this paper, we therefore formally analyze centralized and decentralized critic approaches, and study the effect of using state-based critics in partially observable environments. We derive results contrary to common intuition: critic centralization is not strictly beneficial, and using state values can be harmful. In particular, we prove that state-based critics can introduce unexpected bias and variance compared to history-based critics. Finally, we demonstrate how the theory applies in practice by comparing different forms of critics on a wide range of common multi-agent benchmarks. The experiments reveal practical issues, such as the difficulty of representation learning under partial observability, that help explain why the theoretical problems are often overlooked in the literature. © 2023 The Authors.
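The centralized-critic setup described in the abstract can be illustrated with a minimal one-step actor-critic update: the critic conditions on the global state (available only during centralized training), while each actor conditions solely on its own local observation. This is a hedged sketch, not the paper's implementation; all dimensions, variable names, and the learning rate are illustrative assumptions.

```python
import numpy as np

N_AGENTS = 2    # two agents, two actions each (illustrative sizes)
N_ACTIONS = 2
OBS_DIM = 3     # per-agent local observation
STATE_DIM = 6   # global state, seen only by the critic during training

# Decentralized actors: one linear softmax policy per agent over its own observation.
actor_w = [np.zeros((OBS_DIM, N_ACTIONS)) for _ in range(N_AGENTS)]
# Centralized critic: a linear state-value function over the global state.
critic_w = np.zeros(STATE_DIM)


def policy(i, obs):
    """Softmax policy of agent i, conditioned on its local observation only."""
    logits = obs @ actor_w[i]
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()


def update(state, obs, actions, reward, next_state, alpha=0.1, gamma=0.99):
    """One actor-critic step: the critic uses the global state, actors use local obs."""
    global critic_w
    td_error = reward + gamma * (critic_w @ next_state) - critic_w @ state
    # Critic update: TD(0) on the state value.
    critic_w = critic_w + alpha * td_error * state
    # Each actor ascends its own policy gradient, weighted by the shared TD error.
    for i in range(N_AGENTS):
        p = policy(i, obs[i])
        grad_logp = np.zeros((OBS_DIM, N_ACTIONS))
        grad_logp[:, actions[i]] += obs[i]          # grad of log pi(a|o) for taken action
        grad_logp -= np.outer(obs[i], p)            # minus expectation over all actions
        actor_w[i] += alpha * td_error * grad_logp
    return td_error


# Illustrative step: with zero-initialized weights, the TD error equals the reward.
s = np.ones(STATE_DIM)
obs = [np.ones(OBS_DIM) for _ in range(N_AGENTS)]
td = update(s, obs, actions=[0, 1], reward=1.0, next_state=np.zeros(STATE_DIM))
```

The key structural point the paper examines is the critic's input: a history-based critic would replace the `state` argument with the joint action-observation history, and the paper's analysis concerns precisely when conditioning on the true state instead, as above, introduces bias or variance.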
Pages: 295-354
Related Papers
50 records in total
  • [1] On Centralized Critics in Multi-Agent Reinforcement Learning
    Lyu, Xueguang
    Baisero, Andrea
    Xiao, Yuchen
    Daley, Brett
    Amato, Christopher
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2023, 77 : 295 - 354
  • [2] Centralized reinforcement learning for multi-agent cooperative environments
    Lu, Chengxuan
    Bao, Qihao
    Xia, Shaojie
    Qu, Chongxiao
    [J]. EVOLUTIONARY INTELLIGENCE, 2024, 17 (01) : 267 - 273
  • [3] Exploring communication protocols and centralized critics in multi-agent deep learning
    Simoes, David
    Lau, Nuno
    Reis, Luis Paulo
    [J]. INTEGRATED COMPUTER-AIDED ENGINEERING, 2020, 27 (04) : 333 - 351
  • [4] A centralized reinforcement learning method for multi-agent job scheduling in Grid
    Moradi, Milad
    [J]. 2016 6TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2016, : 171 - 176
  • [5] MAIDRL: Semi-centralized Multi-Agent Reinforcement Learning using Agent Influence
    Harris, Anthony
    Liu, Siming
    [J]. 2021 IEEE CONFERENCE ON GAMES (COG), 2021, : 388 - 395
  • [6] A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning
    Lyu, Xueguang
    Baisero, Andrea
    Xiao, Yuchen
    Amato, Christopher
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 9396 - 9404
  • [7] Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing Centralized Training
    Sharma, Piyush K.
    Zaroukian, Erin G.
    Fernandez, Rolando
    Basak, Anjon
    Asher, Derrik E.
    Dorothy, Michael
    [J]. ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS III, 2021, 11746
  • [8] Cournot Policy Model: Rethinking centralized training in multi-agent reinforcement learning
    Li, Jingchen
    Yang, Yusen
    He, Ziming
    Wu, Huarui
    Shi, Haobin
    Chen, Wenbai
    [J]. INFORMATION SCIENCES, 2024, 677
  • [9] SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning
    Wen, Chao
    Yao, Xinghu
    Wang, Yuhui
    Tan, Xiaoyang
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7301 - 7308