Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning

Times cited: 0
Authors
Emami, Patrick [1]
Zhang, Xiangyu [1]
Biagioni, David [2]
Zamzam, Ahmed S. [1]
Affiliations
[1] Natl Renewable Energy Lab, Golden, CO 80401 USA
[2] Maplewell Energy, Broomfield, CO 80021 USA
DOI
10.1109/CDC49753.2023.10384223
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In multi-timescale multi-agent reinforcement learning (MARL), agents interact across different timescales. In general, policies for time-dependent behaviors, such as those induced by multiple timescales, are non-stationary. Learning non-stationary policies is challenging and typically requires sophisticated or inefficient algorithms. Motivated by the prevalence of this control problem in real-world complex systems, we introduce a simple framework for learning non-stationary policies for multi-timescale MARL. Our approach uses available information about agent timescales to define and learn periodic multi-agent policies. Specifically, we theoretically demonstrate that the effects of non-stationarity introduced by multiple timescales can be learned by a periodic multi-agent policy. To learn such policies, we propose a policy gradient algorithm that parameterizes the actor and critic with phase-functioned neural networks, which provide an inductive bias for periodicity. The framework's ability to effectively learn multi-timescale policies is validated on a gridworld environment and a building energy management environment.
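The idea of a phase-functioned layer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class `PhaseFunctionedLinear`, the helpers `joint_period`, `phase_of`, and `policy_logits`, and the choice of the least common multiple of the agents' action intervals as the joint period are all hypothetical, introduced only for illustration. The sketch also blends weights by cyclic linear interpolation between a few control points, which is a simplification of the cubic spline blending commonly used in phase-functioned networks.

```python
import numpy as np


class PhaseFunctionedLinear:
    """Hypothetical linear layer whose weights are a periodic function of a phase in [0, 1)."""

    def __init__(self, in_dim, out_dim, num_control_points=4, seed=0):
        rng = np.random.default_rng(seed)
        self.K = num_control_points
        # One weight matrix and one bias vector per control point on the phase circle.
        self.W = rng.normal(0.0, 0.1, size=(self.K, out_dim, in_dim))
        self.b = rng.normal(0.0, 0.1, size=(self.K, out_dim))

    def blend(self, phase):
        # Map the phase onto the K control points and interpolate cyclically,
        # so the weights at phase 0 and phase 1 coincide.
        pos = (phase % 1.0) * self.K
        k0 = int(np.floor(pos)) % self.K
        k1 = (k0 + 1) % self.K
        w = pos - np.floor(pos)
        W = (1.0 - w) * self.W[k0] + w * self.W[k1]
        b = (1.0 - w) * self.b[k0] + w * self.b[k1]
        return W, b

    def __call__(self, x, phase):
        W, b = self.blend(phase)
        return W @ x + b


def joint_period(timescales):
    # Assumption: the joint period is the least common multiple of the
    # agents' action intervals, measured in environment steps.
    return int(np.lcm.reduce(np.asarray(timescales, dtype=np.int64)))


def phase_of(t, period):
    # Position of environment step t within the period, scaled to [0, 1).
    return (t % period) / period


def policy_logits(layer, obs, t, period):
    # Same observation, different step within the period: different weights.
    return layer(obs, phase_of(t, period))
```

For example, with agents acting every 2 and every 3 steps, `joint_period([2, 3])` gives a period of 6 steps, and `policy_logits(layer, obs, t, 6)` returns the same output at steps `t` and `t + 6` for a fixed observation, while generally differing at other steps. That repetition is the periodic inductive bias the abstract refers to.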
Pages: 2372 - 2378
Number of pages: 7
Related papers
50 items in total
  • [1] DEALING WITH NON-STATIONARITY IN DECENTRALIZED COOPERATIVE MULTI-AGENT DEEP REINFORCEMENT LEARNING VIA MULTI-TIMESCALE LEARNING
    Nekoei, Hadi
    Badrinaaraayanan, Akilesh
    Sinha, Amit
    Amini, Mohammad
    Rajendran, Janarthanan
    Mahajan, Aditya
    Chandar, Sarath
    [J]. CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023, 232 : 376 - 398
  • [2] Multi-timescale voltage control for distribution system based on multi-agent deep reinforcement learning
    Wu, Zhi
    Li, Yiqi
    Gu, Wei
    Dong, Zengbo
    Zhao, Jingtao
    Liu, Weiliang
    Zhang, Xiao-Ping
    Liu, Pengxiang
    Sun, Qirun
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2023, 147
  • [3] Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments
    Marinescu, Andrei
    Dusparic, Ivana
    Clarke, Siobhan
    [J]. ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2017, 12 (02)
  • [4] Multi-timescale nexting in a reinforcement learning robot
    Modayil, Joseph
    White, Adam
    Sutton, Richard S.
    [J]. ADAPTIVE BEHAVIOR, 2014, 22 (02) : 146 - 160
  • [5] TEAM POLICY LEARNING FOR MULTI-AGENT REINFORCEMENT LEARNING
    Cassano, Lucas
    Alghunaim, Sulaiman A.
    Sayed, Ali H.
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3062 - 3066
  • [6] Reinforcement learning for multi-agent patrol policy
    Lab. of Complex Systems and Intelligence Sciences, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    [J]. Proc. IEEE Int. Conf. Cognitive Informatics, ICCI, : 530 - 535
  • [7] P-MARL: Prediction-Based Multi-Agent Reinforcement Learning for Non-Stationary Environments
    Marinescu, Andrei
    Dusparic, Ivana
    Taylor, Adam
    Cahill, Vinny
    Clarke, Siobhan
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 1897 - 1898
  • [8] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    [J]. 2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43
  • [9] Uncertainty modified policy for multi-agent reinforcement learning
    Zhao, Xinyu
    Liu, Jianxiang
    Wu, Faguo
    Zhang, Xiao
    Wang, Guojian
    [J]. APPLIED INTELLIGENCE, 2024, 54 (22) : 12020 - 12034
  • [10] Cooperative Multi-Agent Reinforcement Learning in a Large Stationary Environment
    Zemzem, Wiem
    Tagina, Moncef
    [J]. 2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017, : 365 - 371