A distributed adaptive policy gradient method based on momentum for multi-agent reinforcement learning

Cited by: 0
Authors
Shi, Junru [1]
Wang, Xin [2]
Zhang, Mingchuan [1]
Liu, Muhua [1]
Zhu, Junlong [1]
Wu, Qingtao [1]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang 471023, Peoples R China
[2] Shanghai Int Studies Univ, Sch Business & Management, Shanghai 200083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distributed reinforcement learning; Importance sampling; Momentum; Policy gradient methods; Variance reduction; ALGORITHMS;
DOI
10.1007/s40747-024-01529-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The policy gradient (PG) method is one of the most popular algorithms in reinforcement learning (RL). However, distributed adaptive variants of PG are rarely studied in multi-agent settings. For this reason, this paper proposes a distributed adaptive policy gradient algorithm (IS-DAPGM) that incorporates Adam-type updates and the importance sampling technique. Furthermore, we establish a theoretical convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of iterations, which matches the convergence rate of state-of-the-art centralized policy gradient methods. In addition, experiments are conducted in a multi-agent environment that is a modification of the Particle World environment. By comparing with other distributed PG methods and varying the number of agents, we verify that IS-DAPGM is more efficient than existing methods.
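The abstract names three ingredients: importance sampling, momentum, and Adam-type updates in a distributed setting. The following is only an illustrative sketch of how such pieces are commonly combined in the momentum-based policy gradient literature; it is not taken from the paper, and IS-DAPGM itself may differ. All function names, the mixing matrix `W`, and the exact form of the momentum estimator are assumptions made for illustration.

```python
import math

# Illustrative sketch only: every name and formula below is an assumption,
# not the paper's algorithm.

def importance_weight(logp_old, logp_new):
    """Ratio p(tau | theta_old) / p(tau | theta_new) from trajectory log-probabilities."""
    return math.exp(logp_old - logp_new)

def momentum_is_gradient(g_new, g_old, d_prev, w, beta):
    """Momentum estimator with importance-sampling correction (assumed form):
    d_t = g_new + (1 - beta) * (d_prev - w * g_old),
    where g_new and g_old are gradients of the same trajectory evaluated at
    the new and the previous parameters, and w is the importance weight."""
    return [gn + (1.0 - beta) * (dp - w * go)
            for gn, go, dp in zip(g_new, g_old, d_prev)]

def consensus_mix(thetas, W):
    """Each agent i replaces its parameters by a weighted average of all agents'
    parameters, using row i of a (hypothetical) doubly stochastic matrix W."""
    n, dim = len(thetas), len(thetas[0])
    return [[sum(W[i][j] * thetas[j][k] for j in range(n)) for k in range(dim)]
            for i in range(n)]

def adam_type_ascent(theta, d, v, lr=0.01, beta2=0.999, eps=1e-8):
    """Adaptive ascent step: scale direction d by a running second-moment
    estimate v (no bias correction, to keep the sketch short)."""
    v = [beta2 * vi + (1.0 - beta2) * di * di for vi, di in zip(v, d)]
    theta = [t + lr * di / (math.sqrt(vi) + eps)
             for t, di, vi in zip(theta, d, v)]
    return theta, v
```

In a loop of this kind, each agent would mix parameters with its neighbours, sample a trajectory, form the corrected momentum direction, and then take the adaptive step. With `beta = 1.0` the momentum estimator reduces to the plain stochastic gradient.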
Pages: 7297-7310
Page count: 14