A distributed adaptive policy gradient method based on momentum for multi-agent reinforcement learning

Cited by: 0

Authors:

Shi, Junru [1]

Wang, Xin [2]

Zhang, Mingchuan [1]

Liu, Muhua [1]

Zhu, Junlong [1]

Wu, Qingtao [1]

Affiliations:

[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang 471023, Peoples R China

[2] Shanghai Int Studies Univ, Sch Business & Management, Shanghai 200083, Peoples R China

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2024 / Vol. 10 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

Distributed reinforcement learning; Importance sampling; Momentum; Policy gradient methods; Variance reduction; ALGORITHMS;

DOI:

10.1007/s40747-024-01529-6

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Policy Gradient (PG) method is one of the most popular algorithms in Reinforcement Learning (RL). However, distributed adaptive variants of PG have rarely been studied in multi-agent settings. For this reason, this paper proposes a distributed adaptive policy gradient algorithm (IS-DAPGM) that incorporates Adam-type updates and an importance sampling technique. We also establish a theoretical convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of iterations, which matches the convergence rate of state-of-the-art centralized policy gradient methods. In addition, experiments are conducted in a multi-agent environment modified from the Particle world environment. By comparing with other distributed PG methods and varying the number of agents, we verify that IS-DAPGM is more efficient than the existing methods.
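The three ingredients named in the abstract (importance sampling, Adam-type updates, and distributed agents) can be illustrated with a generic sketch. This is not the IS-DAPGM algorithm itself, whose exact update rules are not given in this record. The function names, the default hyperparameters, and the use of a doubly stochastic mixing matrix for agent-to-agent averaging are all assumptions made for illustration.

```python
import math

def importance_weight(logp_current, logp_behavior):
    """Trajectory-level likelihood ratio: prod_t pi_cur(a_t|s_t) / pi_beh(a_t|s_t).

    Inputs are per-step log-probabilities of the same trajectory under the
    current policy and under the policy that generated the data.
    """
    return math.exp(sum(logp_current) - sum(logp_behavior))

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-type gradient ASCENT step on a flat parameter list (t starts at 1)."""
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias-corrected first moment
    v_hat = [vi / (1 - beta2 ** t) for vi in v]  # bias-corrected second moment
    theta = [th + lr * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v

def consensus_average(params, mixing):
    """Agent i replaces its parameters with sum_j mixing[i][j] * params[j].

    `mixing` is assumed to be a doubly stochastic matrix (hypothetical choice).
    """
    n, d = len(params), len(params[0])
    return [[sum(mixing[i][j] * params[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]

# Two agents average their parameters, then one takes an Adam-type step.
params = consensus_average([[0.0, 2.0], [4.0, 6.0]], [[0.5, 0.5], [0.5, 0.5]])
theta, m, v = adam_step(params[0], [1.0, -1.0], [0.0, 0.0], [0.0, 0.0], t=1)
```

In a distributed setting each agent would typically alternate a local gradient step with a neighbour-averaging step; how IS-DAPGM orders and combines these steps should be taken from the paper itself.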


Pages: 7297 - 7310

Number of pages: 14

