Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Cited by: 2
Authors
Xiao, Yuchen [1 ]
Lyu, Xueguang [1 ]
Amato, Christopher [1 ]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Funding
U.S. National Science Foundation
DOI
10.1109/MRS50823.2021.9620607
Chinese Library Classification
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance caused by environmental stochasticity and exploring agents (i.e., non-stationarity), a problem potentially exacerbated by the difficulty of credit assignment. There is thus a need for a method that not only solves these two problems efficiently but is also robust enough to handle a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic, while ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. Using this local critic, each agent computes a baseline that reduces the variance of its policy gradient estimate, yielding an advantage action-value taken in expectation over the other agents' choices, which implicitly improves credit assignment. We evaluate ROLA across diverse benchmarks and demonstrate its robustness and effectiveness against a number of state-of-the-art multi-agent policy gradient algorithms.
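The variance-reduction idea in the abstract — subtracting a policy-weighted baseline from a local critic's action values — can be sketched in a few lines. This is a minimal illustration under assumed details (a discrete action space, a softmax policy, and placeholder critic values named `q_local`), not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4

# Hypothetical local critic values Q_i(o_i, a_i) for one agent at one observation.
q_local = rng.normal(size=n_actions)

# Softmax policy pi_i(a | o_i) over the same actions (logits are placeholders).
logits = rng.normal(size=n_actions)
pi = np.exp(logits - logits.max())
pi /= pi.sum()

# Baseline: expected local action value under the agent's own policy,
# b_i = sum_a pi_i(a | o_i) * Q_i(o_i, a).
baseline = float(pi @ q_local)

# Advantage used to weight the policy-gradient term grad log pi_i(a | o_i).
advantage = q_local - baseline

# Key property: the policy-weighted advantage is zero by construction, so the
# baseline reduces variance without biasing the gradient estimate.
print(abs(float(pi @ advantage)) < 1e-9)  # → True
```

Because E_{a~pi}[advantage] = 0, subtracting this baseline leaves the expected policy gradient unchanged while shrinking its variance; ROLA's contribution is training such local critics alongside a centralized critic so that the advantage also accounts for the other agents' choices.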
Pages: 155-163
Page count: 9