Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Cited by: 2
Authors
Xiao, Yuchen [1]
Lyu, Xueguang [1]
Amato, Christopher [1]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Funding
National Science Foundation (USA)
DOI
10.1109/MRS50823.2021.9620607
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance due to the presence of environmental stochasticity and exploring agents (i.e., non-stationarity), which is potentially worsened by the difficulty in credit assignment. As a result, there is a need for a method that is not only capable of efficiently solving the above two problems but also robust enough to solve a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic as well as ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. By using this local critic, each agent calculates a baseline to reduce variance on its policy gradient estimation, which results in an expected advantage action-value over other agents' choices that implicitly improves credit assignment. We evaluate ROLA across diverse benchmarks and show its robustness and effectiveness over a number of state-of-the-art multi-agent policy gradient algorithms.
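The baseline idea in the abstract can be illustrated with a small sketch. It assumes a generic policy-weighted baseline: the advantage of an action is its local action-value minus the expectation of the local action-values under the agent's own policy. The function name `local_advantage` and the example numbers are hypothetical, and this is not claimed to be the exact estimator used in ROLA.

```python
def local_advantage(q_values, policy_probs, action):
    """Advantage of `action` relative to a policy-weighted baseline.

    q_values:     local action-values, one per action (illustrative input)
    policy_probs: the agent's action probabilities, same length as q_values
    action:       index of the action actually taken
    """
    if len(q_values) != len(policy_probs):
        raise ValueError("q_values and policy_probs must have equal length")
    # Baseline: expected local action-value under the agent's own policy.
    baseline = sum(p * q for p, q in zip(policy_probs, q_values))
    return q_values[action] - baseline


# Hypothetical numbers for a single agent with three actions.
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]
advantages = [local_advantage(q, pi, a) for a in range(len(q))]
```

Because the baseline does not depend on the action taken, subtracting it leaves the expected policy gradient unchanged while typically lowering its variance; the policy-weighted mean of `advantages` is zero.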
Pages: 155-163
Page count: 9