QDN: An Efficient Value Decomposition Method for Cooperative Multi-agent Deep Reinforcement Learning

Cited by: 1
Authors
Xie, Zaipeng [1,2]
Zhang, Yufeng [1,2]
Shao, Pengfei [1,2]
Zhao, Weiyi [3]
Affiliations
[1] Hohai Univ, Key Lab Water Big Data Technol, Minist Water Resources, Nanjing, Peoples R China
[2] Hohai Univ, Coll Comp & Informat, Nanjing, Peoples R China
[3] Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep reinforcement learning; cooperative multi-agent systems; convergence performance
DOI
10.1109/ICTAI56018.2022.00183
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-agent systems have recently received significant attention from researchers in many scientific fields. Value factorization is a popular approach for scaling up cooperative reinforcement learning in multi-agent environments. However, approximating the joint value function may introduce a significant disparity between the estimated and actual joint reward value functions, driving cooperative multi-agent deep reinforcement learning into a local optimum. In addition, as the number of agents increases, the input space grows exponentially, degrading the convergence performance of multi-agent algorithms. This work proposes an efficient multi-agent reinforcement learning algorithm, QDN, to enhance convergence performance in cooperative multi-agent tasks. The proposed QDN scheme utilizes a competitive network so that agents can learn the value of the environmental state independently of the influence of actions. Hence, the error between the estimated and actual joint reward value functions can be significantly reduced, preventing the emergence of sub-optimal actions. Meanwhile, the QDN algorithm injects parametric noise into the network weights, introducing randomness that lets the agents explore environments and states effectively and thereby further improves convergence. We evaluate the proposed QDN scheme on SMAC challenges of varying map difficulty. Experimental results show that QDN surpasses several state-of-the-art methods in both convergence speed and success rate in all scenarios. Further experiments on four additional multi-agent tasks demonstrate that QDN is robust across a variety of multi-agent tasks and significantly improves training convergence compared with the state-of-the-art methods.
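The abstract names two mechanisms: a competitive (dueling-style) decomposition, in which the state value V(s) is learned separately from per-action advantages A(s, a) and recombined as Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a'), and parametric noise added to network weights so that exploration comes from the weights rather than epsilon-greedy action selection. The following is a minimal pure-Python sketch of both ideas, not the paper's implementation; all function names and values are illustrative:

```python
import random

def dueling_q(value, advantages):
    # Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    # Subtracting the mean advantage keeps V and A identifiable:
    # shifting every advantage by a constant leaves Q unchanged.
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def noisy_weight(mu, sigma, rng):
    # Parametric noise: perturb a learnable weight mu by sigma * eps,
    # eps ~ N(0, 1), so the perturbed network itself drives exploration.
    return mu + sigma * rng.gauss(0.0, 1.0)

rng = random.Random(0)
v = 1.0                # learned state value V(s)
adv = [2.0, 0.0, 1.0]  # learned advantages A(s, a) for three actions
q = dueling_q(v, adv)  # -> [2.0, 0.0, 1.0] (mean advantage is 1.0)
w = noisy_weight(0.5, 0.1, rng)
```

Note that shifting all advantages by the same constant yields identical Q-values, which is what allows the state value to be learned without interference from the action dimension.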
Pages: 1204-1211
Number of pages: 8