QDN: An Efficient Value Decomposition Method for Cooperative Multi-agent Deep Reinforcement Learning

Citations: 1
Authors
Xie, Zaipeng [1 ,2 ]
Zhang, Yufeng [1 ,2 ]
Shao, Pengfei [1 ,2 ]
Zhao, Weiyi [3 ]
Affiliations
[1] Hohai Univ, Key Lab Water Big Data Technol, Minist Water Resources, Nanjing, Peoples R China
[2] Hohai Univ, Coll Comp & Informat, Nanjing, Peoples R China
[3] Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep reinforcement learning; cooperative multi-agent systems; convergence performance;
DOI
10.1109/ICTAI56018.2022.00183
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Multi-agent systems have recently received significant attention from researchers in many scientific fields. The value factorization method is popular for scaling up cooperative reinforcement learning in multi-agent environments. However, approximating the joint value function may introduce a significant disparity between the estimated and actual joint reward value functions, leading cooperative multi-agent deep reinforcement learning to a local optimum. In addition, as the number of agents increases, the input space grows exponentially, degrading the convergence performance of multi-agent algorithms. This work proposes an efficient multi-agent reinforcement learning algorithm, QDN, to enhance convergence performance in cooperative multi-agent tasks. The proposed QDN scheme utilizes a competitive network that enables the agents to learn the value of the environmental state independently of the influence of actions. Hence, the error between the estimated and actual joint reward value functions can be significantly reduced, preventing the emergence of sub-optimal actions. Meanwhile, the proposed QDN algorithm applies parametric noise to the network weights to introduce randomness, so that the agents can explore environments and states effectively, thereby improving the convergence performance of the QDN algorithm. We evaluate the proposed QDN scheme using the SMAC challenges with various map difficulties. Experimental results show that the QDN algorithm outperforms several state-of-the-art methods in both convergence speed and success rate in all scenarios. Further experiments on four additional multi-agent tasks demonstrate that the QDN algorithm is robust across various multi-agent tasks and can significantly improve training convergence performance compared with the state-of-the-art methods.
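The two mechanisms the abstract describes can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's actual architecture or code: it shows (a) a dueling-style split in which a state-value head learns the value of the state independently of actions and is combined with mean-centered action advantages, and (b) parametric noise on the weights (w = mu + sigma * eps, resampled each forward pass) to drive exploration. All layer sizes and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class NoisyLinear:
    """Linear layer with parametric noise on its weights:
    w = mu + sigma * eps, where eps is resampled on every forward
    pass, so the perturbed weights themselves induce exploration."""
    def __init__(self, n_in, n_out, sigma0=0.5):
        self.mu = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out))
        self.sigma = np.full((n_in, n_out), sigma0 / np.sqrt(n_in))
        self.bias = np.zeros(n_out)
    def __call__(self, x):
        eps = rng.standard_normal(self.mu.shape)  # fresh noise each call
        return x @ (self.mu + self.sigma * eps) + self.bias

def dueling_q(state, value_head, adv_head):
    """Combine state value V(s) with mean-centered advantages A(s, .)
    to get Q(s, .); centering keeps the V/A split identifiable."""
    v = value_head(state)   # shape (1,): value of the state, action-free
    a = adv_head(state)     # shape (n_actions,): per-action advantages
    return v + (a - a.mean())

n_obs, n_actions = 8, 4
value_head = NoisyLinear(n_obs, 1)
adv_head = NoisyLinear(n_obs, n_actions)

s = rng.standard_normal(n_obs)
q = dueling_q(s, value_head, adv_head)
print(q.shape)  # one Q-value per action
```

In a full agent these heads would sit on top of a shared feature network and be trained end-to-end; here the point is only the structure of the value/advantage combination and the noisy-weight forward pass.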
Pages: 1204-1211 (8 pages)