Bias Estimation Correction in Multi-Agent Reinforcement Learning for Mixed Cooperative-Competitive Environments

Authors
Sarkar T. [1, 2]
Kalita S. [1]
Affiliations
[1] Department of Computer Science and Engineering, Tezpur University, Assam, Tezpur
[2] Department of MCA, MS Ramaiah Institute of Technology, Karnataka, Bengaluru
Keywords
Bias estimation correction; MADDPG; MARL; MATD3; Weighted critic update
DOI
10.1007/s42979-023-02326-7
Abstract
Multi-agent reinforcement learning (MARL) is an active area of research. Because MARL algorithms can find promising solutions with limited prior knowledge of the environment, they have been applied to traffic control, unmanned vehicles, routing, and more. Despite algorithmic advances over the years, value-estimation bias persists in MARL methods. The multi-agent twin delayed deep deterministic policy gradient (MATD3) method addresses the overestimation bias that arises in multi-agent deep deterministic policy gradient (MADDPG) by using a double-critic architecture: it calculates the target values and updates the critic networks by taking the minimum of the Q-values estimated by the two target Q-networks. This removes the overestimation but introduces an underestimation bias in the process. This paper proposes a method that corrects for both overestimation and underestimation by incorporating a triple-critic architecture, in which the target values are calculated as a weighted sum of the minimum and the median of the target Q-values. The proposed method is tested in the multi-agent particle environment (MPE). Experimental results show that, apart from correcting the estimation bias, the method outperforms MATD3 and MADDPG in terms of overall returns. Experiments testing the effect of a weighted critic update on the proposed method also give promising results. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
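The target computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mixed_target`, the mixing weight `alpha`, the discount `gamma`, and the choice to put `alpha` on the minimum and `1 - alpha` on the median are all assumptions made for illustration, since the abstract does not state how the weights are set. The three rows of `target_qs` stand in for the outputs of the three target critics.

```python
import numpy as np

def mixed_target(rewards, dones, target_qs, gamma=0.95, alpha=0.5):
    """Bootstrapped target from three target-critic estimates.

    target_qs: array of shape (3, batch), one row per target critic.
    alpha: hypothetical weight on the minimum; (1 - alpha) goes to the median.
    """
    target_qs = np.asarray(target_qs, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)

    q_min = target_qs.min(axis=0)          # pessimistic estimate (MATD3-style)
    q_med = np.median(target_qs, axis=0)   # middle of the three estimates
    q_mix = alpha * q_min + (1.0 - alpha) * q_med

    # Standard one-step target; terminal transitions do not bootstrap.
    return rewards + gamma * (1.0 - dones) * q_mix

# Two transitions, three critics; the second transition is terminal.
qs = [[1.0, 4.0], [2.0, 5.0], [3.0, 9.0]]
y = mixed_target(rewards=[0.0, 1.0], dones=[0.0, 1.0], target_qs=qs,
                 gamma=0.9, alpha=0.5)
```

Under these assumptions, setting `alpha=1.0` reduces the target to the minimum alone, while smaller values of `alpha` pull it toward the median and away from the pessimistic end.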