Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

被引：2

作者：

Xiao, Yuchen ^{[1
]}

Lyu, Xueguang ^{[1
]}

Amato, Christopher ^{[1
]}

机构：

[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA

来源：

2021 INTERNATIONAL SYMPOSIUM ON MULTI-ROBOT AND MULTI-AGENT SYSTEMS (MRS) | 2021年

基金：

美国国家科学基金会;

关键词：

D O I：

10.1109/MRS50823.2021.9620607

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance due to the presence of environmental stochasticity and exploring agents (i.e., non-stationarity), which is potentially worsened by the difficulty in credit assignment. As a result, there is a need for a method that is not only capable of efficiently solving the above two problems but also robust enough to solve a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic as well as ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. By using this local critic, each agent calculates a baseline to reduce variance on its policy gradient estimation, which results in an expected advantage action-value over other agents' choices that implicitly improves credit assignment. We evaluate ROLA across diverse benchmarks and show its robustness and effectiveness over a number of state-of-the-art multi-agent policy gradient algorithms.

引用

页码：155 / 163

页数：9

共 50 条

[1] Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning
Diddigi, Raghuram Bharadwaj
Reddy, D. Sai Koti
Prabuchandran, K. J.
Bhatnagar, Shalabh
[J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1931 - 1933
[2] Multi-Agent Natural Actor-Critic Reinforcement Learning Algorithms
Prashant Trivedi
Nandyala Hemachandra
[J]. Dynamic Games and Applications, 2023, 13 : 25 - 55
[3] Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Christianos, Filippos
Schafer, Lukas
Albrecht, Stefano V.
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[4] Multi-Agent Natural Actor-Critic Reinforcement Learning Algorithms
Trivedi, Prashant
Hemachandra, Nandyala
[J]. DYNAMIC GAMES AND APPLICATIONS, 2023, 13 (01) : 25 - 55
[5] A multi-agent reinforcement learning using Actor-Critic methods
Li, Chun-Gui
Wang, Meng
Yuan, Qing-Neng
[J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 878 - 882
[6] Distributed Multi-Agent Reinforcement Learning by Actor-Critic Method
Heredia, Paulo C.
Mou, Shaoshuai
[J]. IFAC PAPERSONLINE, 2019, 52 (20): : 363 - 368
[7] Actor-Critic for Multi-Agent Reinforcement Learning with Self-Attention
Zhao, Juan
Zhu, Tong
Xiao, Shuo
Gao, Zongqian
Sun, Hao
[J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (09)
[8] Structural relational inference actor-critic for multi-agent reinforcement learning
Zhang, Xianjie
Liu, Yu
Xu, Xiujuan
Huang, Qiong
Mao, Hangyu
Carie, Anil
[J]. NEUROCOMPUTING, 2021, 459 : 383 - 394
[9] Multi-agent reinforcement learning by the actor-critic model with an attention interface
Zhang, Lixiang
Li, Jingchen
Zhu, Yi'an
Shi, Haobin
Hwang, Kao-Shing
[J]. NEUROCOMPUTING, 2022, 471 : 275 - 284
[10] Entropy regularized actor-critic based multi-agent deep reinforcement learning for stochastic games
Hao, Dong
Zhang, Dongcheng
Shi, Qi
Li, Kai
[J]. INFORMATION SCIENCES, 2022, 617 : 17 - 40

← 1 2 3 4 5 →