Bias Estimation Correction in Multi-Agent Reinforcement Learning for Mixed Cooperative-Competitive Environments

Cited by: 0
Authors
Sarkar T. [1, 2]
Kalita S. [1]
Affiliations
[1] Department of Computer Science and Engineering, Tezpur University, Tezpur, Assam
[2] Department of MCA, MS Ramaiah Institute of Technology, Bengaluru, Karnataka
Keywords
Bias estimation correction; MADDPG; MARL; MATD3; Weighted critic update
DOI
10.1007/s42979-023-02326-7
Abstract
Multi-agent reinforcement learning (MARL) is an actively researched domain. The ability of MARL algorithms to find promising solutions with limited prior knowledge of the environment has led to applications in traffic control, unmanned vehicles, routing, and more. Despite algorithmic advances over the years, value estimation bias persists in MARL methods. The multi-agent twin delayed deep deterministic policy gradient (MATD3) method addresses the overestimation bias that arises in multi-agent deep deterministic policy gradient (MADDPG). It does so with a double-critic architecture: MATD3 calculates the target values and updates the critic networks by taking the minimum of the Q-values estimated by the target Q-networks. This solves the overestimation problem but in the process introduces underestimation. This paper proposes a method that corrects for both overestimation and underestimation bias by incorporating a triple-critic architecture. The target values are then calculated as a weighted sum of the minimum and the median of the target Q-values. The proposed method is tested in the multi-agent particle environment (MPE). Experimental results show that our method, apart from correcting the estimation bias, outperforms MATD3 and MADDPG in terms of overall returns. We also perform experiments to test the effect of a weighted critic update on our proposed method and find promising results. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
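The target computation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the weight is defined, so the name `beta`, the convention that `beta` multiplies the minimum and `1 - beta` the median, the discount `gamma`, the `done` flag, and the function name `triple_critic_target` are all assumptions made for this example.

```python
from statistics import median


def triple_critic_target(reward, target_qs, gamma=0.95, beta=0.5, done=False):
    """Blend the minimum and median of three target Q-values into a TD target.

    Hypothetical helper: `beta` and its placement on the minimum term are
    assumptions; the abstract only says "a weighted sum of the minimum and
    the median of the target Q-values".
    """
    if len(target_qs) != 3:
        raise ValueError("expected Q-values from exactly three target critics")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")

    q_min = min(target_qs)     # pessimistic estimate (pushes against overestimation)
    q_med = median(target_qs)  # middle estimate (tempers the pessimism of the min)
    blended = beta * q_min + (1.0 - beta) * q_med

    # Standard one-step TD target; no bootstrapping on terminal transitions.
    return reward + (0.0 if done else gamma * blended)


# Example: three target critics disagree about the next state-action value.
print(triple_critic_target(1.0, [2.0, 4.0, 9.0], gamma=0.5, beta=0.5))
```

Under this assumed convention, `beta = 1` uses only the minimum over the three critics, while `beta = 0` uses only the median, so the weight trades off between the two sources of bias.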
Related papers
50 in total
  • [1] Hierarchical relationship modeling in multi-agent reinforcement learning for mixed cooperative-competitive environments
    Xie, Shaorong
    Li, Yang
    Wang, Xinzhi
    Zhang, Han
    Zhang, Zhenyu
    Luo, Xiangfeng
    Yu, Hang
    INFORMATION FUSION, 2024, 108
  • [2] Mixed Cooperative-Competitive Communication Using Multi-agent Reinforcement Learning
    Vanneste, Astrid
    Van Wijnsberghe, Wesley
    Vanneste, Simon
    Mets, Kevin
    Mercelis, Siegfried
    Latre, Steven
    Hellinckx, Peter
    ADVANCES ON P2P, PARALLEL, GRID, CLOUD AND INTERNET COMPUTING, 3PGCIC-2021, 2022, 343 : 197 - 206
  • [3] Scalable and Transferable Reinforcement Learning for Multi-Agent Mixed Cooperative-Competitive Environments Based on Hierarchical Graph Attention
    Chen, Yining
    Song, Guanghua
    Ye, Zhenhui
    Jiang, Xiaohong
    ENTROPY, 2022, 24 (04)
  • [4] Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
    Lowe, Ryan
    Wu, Yi
    Tamar, Aviv
    Harb, Jean
    Abbeel, Pieter
    Mordatch, Igor
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [5] Demand response model: A cooperative-competitive multi-agent reinforcement learning approach
    Salazar, Eduardo J.
    Rosero, Veronica
    Gabrielski, Jawana
    Samper, Mauricio E.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [6] Bottom-up multi-agent reinforcement learning by reward shaping for cooperative-competitive tasks
    Aotani, Takumi
    Kobayashi, Taisuke
    Sugimoto, Kenji
    APPLIED INTELLIGENCE, 2021, 51 (07) : 4434 - 4452
  • [7] Bottom-up multi-agent reinforcement learning by reward shaping for cooperative-competitive tasks
    Takumi Aotani
    Taisuke Kobayashi
    Kenji Sugimoto
    Applied Intelligence, 2021, 51 : 4434 - 4452
  • [8] A Novel Multi-Agent Parallel-Critic Network Architecture for Cooperative-Competitive Reinforcement Learning
    Sun, Yu
    Lai, Jun
    Cao, Lei
    Chen, Xiliang
    Xu, Zhixiong
    Xu, Yue
    IEEE ACCESS, 2020, 8 : 135605 - 135616
  • [9] Decentralized Counterfactual Value with Threat Detection for Multi-Agent Reinforcement Learning in mixed cooperative and competitive environments
    Dong, Shaokang
    Li, Chao
    Yang, Shangdong
    Li, Wenbin
    Gao, Yang
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 257
  • [10] Centralized reinforcement learning for multi-agent cooperative environments
    Chengxuan Lu
    Qihao Bao
    Shaojie Xia
    Chongxiao Qu
    Evolutionary Intelligence, 2024, 17 : 267 - 273