Scalable Multi-Agent Reinforcement Learning with General Utilities

Cited by: 0
Authors
Ying, Donghao [1]
Ding, Yuhao [1]
Koppel, Alec [2]
Lavaei, Javad [1]
Affiliations
[1] Univ Calif Berkeley, Dept Ind Engn & Operat Res, Berkeley, CA 94720 USA
[2] JP Morgan AI Res, Brooklyn, NY USA
Keywords
COMPLEXITY
DOI
10.23919/ACC55779.2023.10156072
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We study scalable multi-agent reinforcement learning (MARL) with general utilities, defined as nonlinear functions of the team's long-term state-action occupancy measure. The objective is to find a localized policy that maximizes the average of the team's local utility functions without requiring full observability of each agent in the team. By exploiting the spatial correlation decay property of the network structure, we propose a scalable distributed policy gradient algorithm with shadow reward and localized policy that consists of three steps: (1) shadow reward estimation, (2) truncated shadow Q-function estimation, and (3) truncated policy gradient estimation and policy update. Our algorithm converges, with high probability, to ε-stationarity with O(ε^{-2}) samples, up to an approximation error that decreases exponentially in the communication radius. This is the first result in the literature on multi-agent RL with general utilities that does not require full observability.
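The first of the three steps, shadow reward estimation, can be illustrated with a minimal single-agent sketch. It assumes the usual reading in the general-utilities literature that the shadow reward is the gradient of the utility with respect to the occupancy measure; the abstract does not spell this out. All function names below are hypothetical, the occupancy estimate is an unnormalized discounted visitation count, and the gradient is taken by finite differences for illustration only. This is not the paper's distributed, truncated estimator.

```python
import numpy as np

def empirical_occupancy(trajectory, n_states, n_actions, gamma):
    """Unnormalized discounted state-action visitation counts from one trajectory.

    trajectory is a list of (state, action) index pairs (an assumed format).
    """
    lam = np.zeros((n_states, n_actions))
    for t, (s, a) in enumerate(trajectory):
        lam[s, a] += gamma ** t
    return lam

def shadow_reward(utility, lam, h=1e-5):
    """Central finite-difference gradient of `utility` at occupancy `lam`.

    Assumption: the shadow reward is d utility / d lam, one entry per (s, a).
    """
    grad = np.zeros_like(lam)
    for idx in np.ndindex(lam.shape):
        bump = np.zeros_like(lam)
        bump[idx] = h
        grad[idx] = (utility(lam + bump) - utility(lam - bump)) / (2 * h)
    return grad

# A linear utility <r, lam> recovers the ordinary reward table r.
r = np.array([[1.0, 0.0], [0.5, 2.0]])
linear_utility = lambda lam: float(np.sum(r * lam))

# A nonlinear utility: negative squared distance to a target occupancy.
target = np.array([[0.5, 0.5], [0.5, 0.5]])
quadratic_utility = lambda lam: -0.5 * float(np.sum((lam - target) ** 2))

lam_hat = empirical_occupancy([(0, 1), (1, 0), (0, 1)], 2, 2, gamma=0.5)
shadow_linear = shadow_reward(linear_utility, lam_hat)
shadow_quadratic = shadow_reward(quadratic_utility, lam_hat)
```

With a linear utility the shadow reward does not depend on the occupancy, which is the standard cumulative-reward setting; with the quadratic utility it changes as the occupancy estimate changes, which is why it has to be re-estimated before each policy update.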
Pages: 3977-3982
Page count: 6
Related papers
50 in total
  • [1] Scalable Reinforcement Learning Policies for Multi-Agent Control
    Hsu, Christopher D.
    Jeong, Heejin
    Pappas, George J.
    Chaudhari, Pratik
    [J]. 2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 4785 - 4791
  • [2] Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
    Leibo, Joel Z.
    Duenez-Guzman, Edgar
    Vezhnevets, Alexander Sasha
    Agapiou, John P.
    Sunehag, Peter
    Koster, Raphael
    Matyas, Jayd
    Beattie, Charles
    Mordatch, Igor
    Graepel, Thore
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [3] Scalable Robust Multi-Agent Reinforcement Learning for Model Uncertainty
    Jwa, Younkyung
    Gwak, Minseon
    Kwak, Jiin
    Ahn, Chang Wook
    Park, PooGyeon
    [J]. 2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 3402 - 3407
  • [4] Multi-Agent Reinforcement Learning with General Utilities via Decentralized Shadow Reward Actor-Critic
    Zhang, Junyu
    Bedi, Amrit Singh
    Wang, Mengdi
    Koppel, Alec
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 9031 - 9039
  • [5] PowerNet: Multi-Agent Deep Reinforcement Learning for Scalable Powergrid Control
    Chen, Dong
    Chen, Kaian
    Li, Zhaojian
    Chu, Tianshu
    Yao, Rui
    Qiu, Feng
    Lin, Kaixiang
    [J]. IEEE TRANSACTIONS ON POWER SYSTEMS, 2022, 37 (02) : 1007 - 1017
  • [6] Scalable Autonomous Separation Assurance With Heterogeneous Multi-Agent Reinforcement Learning
    Brittain, Marc
    Wei, Peng
    [J]. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2022, 19 (04) : 2837 - 2848
  • [7] Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
    Qu, Guannan
    Lin, Yiheng
    Wierman, Adam
    Li, Na
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [8] Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems
    Qu, Guannan
    Wierman, Adam
    Li, Na
    [J]. LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 256 - 266
  • [9] Scalable Multi-Agent Reinforcement Learning for Dynamic Coordinated Multipoint Clustering
    Hu, Fenghe
    Deng, Yansha
    Hamid Aghvami, A.
    [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 2023, 71 (01) : 101 - 114
  • [10] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    [J]. 2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43