Distributed reinforcement learning in multi-agent networks

Cited by: 0
Authors
Kar, Soummya [1]
Moura, Jose M. F. [1]
Poor, H. Vincent [2]
Affiliations
[1] Carnegie Mellon Univ, Dept ECE, Pittsburgh, PA 15213 USA
[2] Princeton Univ, Dept EE, Princeton, NJ 08544 USA
Funding
U.S. National Science Foundation;
Keywords
Multi-agent stochastic control; distributed Q-learning; reinforcement learning; collaborative network processing; consensus plus innovations; distributed stochastic approximation;
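DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract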
Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) which respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state transition and local agent cost statistics) that minimizes network-averaged infinite horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data and the information received from neighboring agents. Under broad conditions on the multi-agent decision model and mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically, i.e., almost surely (a.s.) each network agent is shown to learn the value function and the optimal stationary control policy of the collaborative MDP asymptotically. Further, convergence rate estimates for the proposed class of distributed learning algorithms are obtained.
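The abstract does not state the update rule itself, so the following is only a minimal sketch of what one consensus + innovations step of distributed Q-learning might look like. It assumes a tabular setting, a fixed neighbor list, and scalar step sizes; the function name, argument names, and the exact form of the update are illustrative assumptions, not the authors' algorithm as published.

```python
def consensus_innovations_q_step(Q, neighbors, state, action, next_state,
                                 costs, alpha, beta, gamma):
    """Apply one synchronous update to entry (state, action) of every agent's Q-table.

    Assumed (hypothetical) data layout:
      Q[n][s][a]   -- agent n's current estimate for state s and action a
      neighbors[n] -- list of agent indices that agent n can hear from
      costs[n]     -- agent n's own one-stage cost observed at this step
      alpha, beta  -- innovation and consensus step sizes for this step
      gamma        -- discount factor in (0, 1)
    """
    updated = []
    for n in range(len(Q)):
        q_n = Q[n][state][action]
        # Consensus term: disagreement between agent n and its neighbors.
        consensus = sum(q_n - Q[m][state][action] for m in neighbors[n])
        # Innovation term: local temporal-difference error built from agent n's
        # own cost; costs are minimized, so the bootstrap uses min over actions.
        innovation = costs[n] + gamma * min(Q[n][next_state]) - q_n
        updated.append(q_n - beta * consensus + alpha * innovation)
    # Write back only after all agents have computed their update, so every
    # agent uses its neighbors' values from the previous step.
    for n, value in enumerate(updated):
        Q[n][state][action] = value
    return Q
```

In a sketch like this, the consensus term pulls neighboring estimates together while the innovation term injects each agent's private cost observation. The step-size schedules for `alpha` and `beta` are left to the caller because the abstract gives no values for them.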
Pages: 296 / +
Page count: 2
Related papers
50 in total
  • [1] Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication
    Xu, Chi
    Zhang, Hui
    Zhang, Ya
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2915 - 2920
  • [2] Multi-agent systems on sensor networks: A distributed reinforcement learning approach
    Tham, CK
    Renaud, JC
    PROCEEDINGS OF THE 2005 INTELLIGENT SENSORS, SENSOR NETWORKS & INFORMATION PROCESSING CONFERENCE, 2005, : 423 - 429
  • [3] Parallel and distributed multi-agent reinforcement learning
    Kaya, M
    Arslan, A
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, 2001, : 437 - 441
  • [4] Coding for Distributed Multi-Agent Reinforcement Learning
    Wang, Baoqian
    Xie, Junfei
    Atanasov, Nikolay
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 10625 - 10631
  • [5] Distributed Transmission Control for Wireless Networks using Multi-Agent Reinforcement Learning
    Farquhar, Collin
    Kumar, Prem
    Jagannath, Anu
    Jagannath, Jithin
    BIG DATA IV: LEARNING, ANALYTICS, AND APPLICATIONS, 2022, 12097
  • [6] Distributed localization for IoT with multi-agent reinforcement learning
    Jia, Jie
    Yu, Ruoying
    Du, Zhenjun
    Chen, Jian
    Wang, Qinghu
    Wang, Xingwei
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (09): : 7227 - 7240
  • [7] Distributed Coordination Guidance in Multi-Agent Reinforcement Learning
    Lau, Qiangfeng Peter
    Lee, Mong Li
    Hsu, Wynne
    2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 456 - 463
  • [8] Distributed reinforcement learning in multi-agent decision systems
    Giráldez, JI
    Borrajo, D
    PROGRESS IN ARTIFICIAL INTELLIGENCE-IBERAMIA 98, 1998, 1484 : 148 - 159
  • [9] Distributed localization for IoT with multi-agent reinforcement learning
    Jie Jia
    Ruoying Yu
    Zhenjun Du
    Jian Chen
    Qinghu Wang
    Xingwei Wang
    Neural Computing and Applications, 2022, 34 : 7227 - 7240
  • [10] DISTRIBUTED RESOURCE ALLOCATION IN 5G NETWORKS WITH MULTI-AGENT REINFORCEMENT LEARNING
    Menard, Jon
    Al-Habashna, Ala'a
    Wainer, Gabriel
    Boudreau, Gary
    PROCEEDINGS OF THE 2022 ANNUAL MODELING AND SIMULATION CONFERENCE (ANNSIM'22), 2022, : 802 - 813