Distributed Multiagent Reinforcement Learning With Action Networks for Dynamic Economic Dispatch

Cited by: 8
Authors
Hu, Chengfang [1]
Wen, Guanghui [2]
Wang, Shuai [3, 4]
Fu, Junjie [2]
Yu, Wenwu [2]
Affiliations
[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China
[2] Southeast Univ, Sch Math, Dept Syst Sci, Nanjing 211189, Peoples R China
[3] Beihang Univ, Res Inst Frontier Sci, Beijing 100191, Peoples R China
[4] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Power demand; Heuristic algorithms; Prediction algorithms; Couplings; Approximation algorithms; Power system stability; Convex functions; Distributed optimization; dynamic economic dispatch; multiagent reinforcement learning (MARL); smart grids; VISIBLE IMAGE FUSION; PERFORMANCE; INFORMATION; ALGORITHM; PROTEIN;
DOI
10.1109/TNNLS.2023.3234049
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A new class of distributed multiagent reinforcement learning (MARL) algorithm suitable for problems with coupling constraints is proposed in this article to address the dynamic economic dispatch problem (DEDP) in smart grids. Specifically, the assumption made commonly in most existing results on the DEDP that the cost functions are known and/or convex is removed in this article. A distributed projection optimization algorithm is designed for the generation units to find the feasible power outputs satisfying the coupling constraints. By using a quadratic function to approximate the state-action value function of each generation unit, the approximate optimal solution of the original DEDP can be obtained by solving a convex optimization problem. Then, each action network utilizes a neural network (NN) to learn the relationship between the total power demand and the optimal power output of each generation unit, such that the algorithm obtains the generalization ability to predict the optimal power output distribution on an unseen total power demand. Furthermore, an improved experience replay mechanism is introduced into the action networks to improve the stability of the training process. Finally, the effectiveness and robustness of the proposed MARL algorithm are verified by simulation.
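As a rough illustration of the convex subproblem mentioned in the abstract, the sketch below dispatches a total demand across units whose costs are approximated by quadratics. It is a centralized stand-in that bisects on the Lagrange multiplier of the power-balance constraint; it is not the article's distributed projection algorithm, and the function name `dispatch`, the coefficients `a` and `b`, and the box limits `p_min` and `p_max` are all assumptions made for this example.

```python
def dispatch(a, b, p_min, p_max, demand, tol=1e-9):
    """Minimize sum_i (a[i]*p[i]**2 + b[i]*p[i]) subject to
    sum(p) == demand and p_min[i] <= p[i] <= p_max[i], with a[i] > 0.

    Hypothetical helper: a centralized bisection on the balance multiplier.
    """
    if not (sum(p_min) <= demand <= sum(p_max)):
        raise ValueError("demand is outside the feasible range")

    def outputs(lam):
        # Stationarity gives p_i = (lam - b_i) / (2 a_i), clipped to the box.
        return [min(max((lam - bi) / (2.0 * ai), lo), hi)
                for ai, bi, lo, hi in zip(a, b, p_min, p_max)]

    # Bracket the multiplier by the smallest and largest marginal costs.
    lam_lo = min(2.0 * ai * lo + bi for ai, bi, lo in zip(a, b, p_min))
    lam_hi = max(2.0 * ai * hi + bi for ai, bi, hi in zip(a, b, p_max))

    # Total output is nondecreasing in lam, so bisection applies.
    for _ in range(200):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(outputs(lam)) < demand:
            lam_lo = lam
        else:
            lam_hi = lam
        if lam_hi - lam_lo < tol:
            break
    return outputs(0.5 * (lam_lo + lam_hi))


if __name__ == "__main__":
    # Two illustrative units with made-up coefficients and limits.
    print(dispatch([0.1, 0.05], [2.0, 3.0], [0.0, 0.0], [100.0, 100.0], 60.0))
```

With the illustrative coefficients above, the two units end up with equal marginal cost, which is the usual optimality condition when no limit is active.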
Pages: 9553-9564
Number of pages: 12