A distributed adaptive policy gradient method based on momentum for multi-agent reinforcement learning

Cited by: 0
Authors
Shi, Junru [1 ]
Wang, Xin [2 ]
Zhang, Mingchuan [1 ]
Liu, Muhua [1 ]
Zhu, Junlong [1 ]
Wu, Qingtao [1 ]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang 471023, Peoples R China
[2] Shanghai Int Studies Univ, Sch Business & Management, Shanghai 200083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Distributed reinforcement learning; Importance sampling; Momentum; Policy gradient methods; Variance reduction; ALGORITHMS;
DOI
10.1007/s40747-024-01529-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Policy Gradient (PG) method is one of the most popular algorithms in Reinforcement Learning (RL). However, distributed adaptive variants of PG have rarely been studied in the multi-agent setting. For this reason, this paper proposes a distributed adaptive policy gradient algorithm (IS-DAPGM) that incorporates Adam-type updates and an importance sampling technique. Furthermore, we establish a theoretical convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ represents the number of iterations, which matches the convergence rate of state-of-the-art centralized policy gradient methods. In addition, many experiments are conducted in a multi-agent environment that is a modification of the Particle world environment. By comparing with other distributed PG methods and varying the number of agents, we verify that IS-DAPGM is more efficient than the existing methods.
Pages: 7297-7310
Page count: 14
Related papers
50 in total
  • [1] A Deep Reinforcement Learning Method based on Deterministic Policy Gradient for Multi-Agent Cooperative Competition
    Zuo, Xuan
    Xue, Hui-Feng
    Wang, Xiao-Yin
    Du, Wan-Ru
    Tian, Tao
    Gao, Shan
    Zhang, Pu
    CONTROL ENGINEERING AND APPLIED INFORMATICS, 2021, 23 (03): : 88 - 98
  • [2] QSOD: Hybrid Policy Gradient for Deep Multi-agent Reinforcement Learning
    Rehman, Hafiz Muhammad Raza Ur
    On, Byung-Won
    Ningombam, Devarani Devi
    Yi, Sungwon
    Choi, Gyu Sang
    IEEE ACCESS, 2021, 9 : 129728 - 129741
  • [3] Learning Distributed Coordinated Policy in Catching Game with Multi-Agent Reinforcement Learning
    Liu, Xiangyu
    Tan, Ying
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [4] Multi-agent Gradient-Based Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
    Ren, Jineng
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [5] Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication
    Xu, Chi
    Zhang, Hui
    Zhang, Ya
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2915 - 2920
  • [6] QDAP: Downsizing adaptive policy for cooperative multi-agent reinforcement learning
    Zhao, Zhitong
    Zhang, Ya
    Wang, Siying
    Zhang, Fan
    Zhang, Malu
    Chen, Wenyu
    KNOWLEDGE-BASED SYSTEMS, 2024, 294
  • [7] Parallel and distributed multi-agent reinforcement learning
    Kaya, M
    Arslan, A
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, 2001, : 437 - 441
  • [8] Coding for Distributed Multi-Agent Reinforcement Learning
    Wang, Baoqian
    Xie, Junfei
    Atanasov, Nikolay
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 10625 - 10631
  • [9] Distributed reinforcement learning in multi-agent networks
    Kar, Soummya
    Moura, Jose M. F.
    Poor, H. Vincent
    2013 IEEE 5TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2013), 2013, : 296 - +
  • [10] Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning
    Lu, Songtao
    Zhang, Kaiqing
    Chen, Tianyi
    Basar, Tamer
    Horesh, Lior
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8767 - 8775