Distributed entropy-regularized multi-agent reinforcement learning with policy consensus

Cited by: 1
Authors
Hu, Yifan [1]
Fu, Junjie [1]
Wen, Guanghui [1]
Lv, Yuezu [2]
Ren, Wei [3]
Affiliations
[1] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
[3] Univ Calif Riverside, Dept Elect & Comp Engn, Riverside, CA 92521 USA
Keywords
Distributed actor-critic algorithm; Networked multi-agent system; Entropy regularization; Deep reinforcement learning; Algorithm; Networks
DOI
10.1016/j.automatica.2024.111652
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sample efficiency is a limiting factor for existing distributed multi-agent reinforcement learning (MARL) algorithms over networked multi-agent systems. In this paper, the sample efficiency problem is tackled by formally incorporating entropy regularization into the distributed MARL algorithm design. First, a new entropy-regularized MARL problem is formulated under the model of networked multi-agent Markov decision processes with observation-based policies and homogeneous agents, where sharing policy parameters among the agents provably preserves optimality. Second, an on-policy distributed actor-critic algorithm is proposed, in which each agent shares the parameters of both its critic and its actor for the consensus update. Then, a convergence analysis of the proposed algorithm is provided, based on stochastic approximation theory and under the assumption of linear function approximation of the critic. Furthermore, a practical off-policy version of the proposed algorithm is developed, which possesses scalability, data efficiency and learning stability. Finally, the proposed distributed algorithm is compared against strong baselines, including two classic centralized training algorithms, in the multi-agent particle environment, and its learning performance is demonstrated empirically through extensive simulation experiments. © 2024 Elsevier Ltd. All rights reserved.
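The two ingredients named in the abstract, an entropy bonus on the reward and a consensus update of shared parameters over the communication network, can be illustrated with a minimal sketch. This is not the paper's algorithm: the softmax policy, the temperature `alpha`, the mixing matrix `W`, and the function names below are all assumptions introduced for illustration only.

```python
import numpy as np

def softmax_entropy(logits):
    """Shannon entropy of the softmax distribution defined by `logits`."""
    z = logits - np.max(logits)          # shift for numerical stability
    p = np.exp(z) / np.sum(np.exp(z))
    return float(-np.sum(p * np.log(p + 1e-12)))

def entropy_regularized_reward(reward, logits, alpha):
    """Reward plus an entropy bonus weighted by the (assumed) temperature alpha."""
    return reward + alpha * softmax_entropy(logits)

def consensus_step(params, weights):
    """One consensus round: each agent replaces its parameter vector by a
    weighted average of its neighbours' vectors.
    params: (n_agents, dim); weights: (n_agents, n_agents), rows sum to 1."""
    return weights @ params

# Hypothetical ring network of 4 agents with a doubly stochastic mixing matrix.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

# Each agent starts from a different scalar parameter.
theta = np.array([[1.0], [2.0], [3.0], [6.0]])
for _ in range(50):
    theta = consensus_step(theta, W)

# A uniform policy over 4 actions has the maximum entropy, log(4).
bonus_reward = entropy_regularized_reward(1.0, np.zeros(4), alpha=0.1)
```

Because `W` is doubly stochastic and the ring is connected, the repeated averaging drives every agent's parameter toward the network-wide mean of the initial values. In an actual actor-critic method, a local gradient step would be interleaved with each consensus round.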
Pages: 13
Related papers
50 in total
  • [31] Towards a Distributed Framework for Multi-Agent Reinforcement Learning Research
    Zhou, Yutai
    Manuel, Shawn
    Morales, Peter
    Li, Sheng
    Pena, Jaime
    Allen, Ross
    2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020
  • [32] Multi-Agent Deep Reinforcement Learning for Distributed Satellite Routing
    Lozano-Cuadra, Federico
    Soret, Beatriz
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024: 554-555
  • [33] A Multi-agent Reinforcement Learning Perspective on Distributed Traffic Engineering
    Geng, Nan
    Lan, Tian
    Aggarwal, Vaneet
    Yang, Yuan
    Xu, Mingwei
    2020 IEEE 28TH INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS (IEEE ICNP 2020), 2020
  • [34] Distributed hierarchical reinforcement learning in multi-agent adversarial environments
    Naderializadeh, Navid
    Soleyman, Sean
    Hung, Fan
    Khosla, Deepak
    Chen, Yang
    Fadaie, Joshua G.
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS IV, 2022, 12113
  • [35] Distributed consensus for nonlinear multi-agent systems with two-time-scales: A hybrid reinforcement learning consensus algorithm
    Peng, Chuanjun
    Xia, Jianwei
    Wang, Jing
    Shen, Hao
    INFORMATION SCIENCES, 2023, 641
  • [36] Off-policy Reinforcement Learning for Distributed Output Synchronization of Linear Multi-agent Systems
    Kiumarsi, Bahare
    Lewis, Frank L.
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017: 1877-1884
  • [37] A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
    Suttle, Wesley
    Yang, Zhuoran
    Zhang, Kaiqing
    Wang, Zhaoran
    Basar, Tamer
    Liu, Ji
    IFAC PAPERSONLINE, 2020, 53 (02): 1549-1554
  • [38] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016: 43
  • [39] Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning
    Cui, Kai
    Koeppl, Heinz
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [40] Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning
    Mu, Ronghui
    Ruan, Wenjie
    Marcolino, Leandro Soriano
    Jin, Gaojie
    Ni, Qiang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 12, 2023: 15046-15054