Distributed entropy-regularized multi-agent reinforcement learning with policy consensus

Cited by: 1
Authors
Hu, Yifan [1]
Fu, Junjie [1]
Wen, Guanghui [1]
Lv, Yuezu [2]
Ren, Wei [3]
Affiliations
[1] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
[3] Univ Calif Riverside, Dept Elect & Comp Engn, Riverside, CA 92521 USA
Keywords
Distributed actor-critic algorithm; Networked multi-agent system; Entropy regularization; Deep reinforcement learning; ALGORITHM; NETWORKS;
DOI
10.1016/j.automatica.2024.111652
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Sample efficiency is a limiting factor for existing distributed multi-agent reinforcement learning (MARL) algorithms over networked multi-agent systems. In this paper, the sample efficiency problem is tackled by formally incorporating entropy regularization into the distributed MARL algorithm design. First, a new entropy-regularized MARL problem is formulated under the model of networked multi-agent Markov decision processes with observation-based policies and homogeneous agents, where sharing policy parameters among the agents provably preserves optimality. Second, an on-policy distributed actor-critic algorithm is proposed, in which each agent shares the parameters of both its critic and its actor for consensus updates. Then, a convergence analysis of the proposed algorithm is provided based on stochastic approximation theory under the assumption of linear function approximation of the critic. Furthermore, a practical off-policy version of the proposed algorithm is developed, which offers scalability, data efficiency, and learning stability. Finally, the proposed distributed algorithm is compared against strong baselines, including two classic centralized training algorithms, in the multi-agent particle environment, and its learning performance is empirically demonstrated through extensive simulation experiments. (c) 2024 Elsevier Ltd. All rights reserved.
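The two ingredients named in the abstract, an entropy bonus on the policy and a consensus update over shared parameters, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the temperature `alpha`, the ring topology, and the weight matrix `W` are all assumptions chosen for illustration, and the gradient steps of the actor and critic are omitted entirely.

```python
import numpy as np

def softmax_entropy(logits):
    """Shannon entropy of a softmax policy over discrete actions."""
    z = logits - np.max(logits)            # shift for numerical stability
    p = np.exp(z) / np.sum(np.exp(z))
    return float(-np.sum(p * np.log(p + 1e-12)))

def entropy_regularized_reward(reward, logits, alpha=0.1):
    """Reward plus an entropy bonus; alpha is a hypothetical temperature."""
    return reward + alpha * softmax_entropy(logits)

def consensus_step(params, weights):
    """Each agent replaces its parameters with a weighted average of its
    neighbors' parameters. params: (n_agents, dim); weights: (n_agents,
    n_agents), doubly stochastic, zero where two agents share no link."""
    return weights @ params

# Assumed example: 4 agents on a ring, each averaging with its 2 neighbors.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

rng = np.random.default_rng(0)
theta = rng.normal(size=(4, 3))            # one parameter vector per agent
mean_before = theta.mean(axis=0)

for _ in range(200):
    theta = consensus_step(theta, W)       # repeated local averaging
```

Because the assumed `W` is doubly stochastic and the ring is connected, repeated averaging drives every agent's parameters toward the initial network-wide mean. In a full actor-critic scheme, a step of this kind would be interleaved with local gradient updates.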
Pages: 13
Related papers
50 in total
  • [41] MABQN: Multi-agent reinforcement learning algorithm with discrete policy
    Xie, Qing
    Wang, Zicheng
    Fang, Yuyuan
    Li, Yukai
    NEUROCOMPUTING, 2025, 626
  • [42] Adversarial attacks in consensus-based multi-agent reinforcement learning
    Figura, Martin
    Kosaraju, Krishna Chaitanya
    Gupta, Vijay
    2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 3050 - 3055
  • [43] Distributed Consensus-Based Multi-Agent Off-Policy Temporal-Difference Learning
    Stankovic, Milos S.
    Beko, Marko
    Stankovic, Srdjan S.
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 5976 - 5981
  • [44] Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation
    Liang, Zhixuan
    Cao, Jiannong
    Jiang, Shan
    Saxena, Divya
    Xu, Huafeng
    2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 884 - 894
  • [45] Multi-Agent Distributed Reinforcement Learning for Making Decentralized Offloading Decisions
    Tan, Jing
    Khalili, Ramin
    Karl, Holger
    Hecker, Artur
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2022), 2022, : 2098 - 2107
  • [46] Distributed, Heterogeneous, Multi-Agent Social Coordination via Reinforcement Learning
    Shi, Dongqing
    Sauter, Michael Z.
    Kralik, Jerald D.
    2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO 2009), VOLS 1-4, 2009, : 653 - 658
  • [47] Dynamic distributed constraint optimization using multi-agent reinforcement learning
    Shokoohi, Maryam
    Afsharchi, Mohsen
    Shah-Hoseini, Hamed
    SOFT COMPUTING, 2022, 26 (08) : 3601 - 3629
  • [48] Dynamic distributed constraint optimization using multi-agent reinforcement learning
    Maryam Shokoohi
    Mohsen Afsharchi
    Hamed Shah-Hoseini
    Soft Computing, 2022, 26 : 3601 - 3629
  • [49] Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game
    Qu Z.-W.
    Pan Z.-T.
    Chen Y.-H.
    Li H.-T.
    Wang X.
    Chen, Yong-Heng (cyh@jlu.edu.cn), 1600, Science Press (20): : 76 - 82 and 100
  • [50] Distributed Multi-agent Reinforcement Learning for Directional UAV Network Control
    He, Linsheng
    Zhao, Jiamiao
    Hu, Fei
    PROCEEDINGS OF THE 32ND INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2023, 2023, : 317 - 318