Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Cited by: 2
Authors
Xiao, Yuchen [1 ]
Lyu, Xueguang [1 ]
Amato, Christopher [1 ]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Funding
U.S. National Science Foundation
DOI
10.1109/MRS50823.2021.9620607
Chinese Library Classification
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance caused by environmental stochasticity and exploring agents (i.e., non-stationarity), a problem potentially exacerbated by the difficulty of credit assignment. There is thus a need for a method that not only solves these two problems efficiently but is also robust enough to handle a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic, while ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. Using this local critic, each agent computes a baseline that reduces the variance of its policy gradient estimate, yielding an advantage action-value taken in expectation over the other agents' choices, which implicitly improves credit assignment. We evaluate ROLA across diverse benchmarks and demonstrate its robustness and effectiveness against a number of state-of-the-art multi-agent policy gradient algorithms.
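The variance-reduction idea in the abstract — subtracting a policy-weighted baseline from a local critic's action values — can be sketched in a few lines. This is a minimal illustration under assumed details (a discrete action space, a softmax policy, and placeholder critic values named `q_local`), not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4

# Hypothetical local critic values Q_i(o_i, a_i) for one agent at one observation.
q_local = rng.normal(size=n_actions)

# Softmax policy pi_i(a | o_i) over the same actions (logits are placeholders).
logits = rng.normal(size=n_actions)
pi = np.exp(logits - logits.max())
pi /= pi.sum()

# Baseline: expected local action value under the agent's own policy,
# b_i = sum_a pi_i(a | o_i) * Q_i(o_i, a).
baseline = float(pi @ q_local)

# Advantage used to weight the policy-gradient term grad log pi_i(a | o_i).
advantage = q_local - baseline

# Key property: the policy-weighted advantage is zero by construction, so the
# baseline reduces variance without biasing the gradient estimate.
print(abs(float(pi @ advantage)) < 1e-9)  # → True
```

Because E_{a~pi}[advantage] = 0, subtracting this baseline leaves the expected policy gradient unchanged while shrinking its variance; ROLA's contribution is training such local critics alongside a centralized critic so that the advantage also accounts for the other agents' choices.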
Pages: 155-163
Page count: 9