Distributed Multiagent Reinforcement Learning With Action Networks for Dynamic Economic Dispatch

Cited by: 8

Authors:

Hu, Chengfang [1]

Wen, Guanghui [2]

Wang, Shuai [3,4]

Fu, Junjie [2]

Yu, Wenwu [2]

Affiliations:

[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China

[2] Southeast Univ, Sch Math, Dept Syst Sci, Nanjing 211189, Peoples R China

[3] Beihang Univ, Res Inst Frontier Sci, Beijing 100191, Peoples R China

[4] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024, Vol. 35, No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Power demand; Heuristic algorithms; Prediction algorithms; Couplings; Approximation algorithms; Power system stability; Convex functions; Distributed optimization; dynamic economic dispatch; multiagent reinforcement learning (MARL); smart grids; VISIBLE IMAGE FUSION; PERFORMANCE; INFORMATION; ALGORITHM; PROTEIN;

DOI:

10.1109/TNNLS.2023.3234049

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A new class of distributed multiagent reinforcement learning (MARL) algorithm suitable for problems with coupling constraints is proposed in this article to address the dynamic economic dispatch problem (DEDP) in smart grids. Specifically, the assumption made commonly in most existing results on the DEDP that the cost functions are known and/or convex is removed in this article. A distributed projection optimization algorithm is designed for the generation units to find the feasible power outputs satisfying the coupling constraints. By using a quadratic function to approximate the state-action value function of each generation unit, the approximate optimal solution of the original DEDP can be obtained by solving a convex optimization problem. Then, each action network utilizes a neural network (NN) to learn the relationship between the total power demand and the optimal power output of each generation unit, such that the algorithm obtains the generalization ability to predict the optimal power output distribution on an unseen total power demand. Furthermore, an improved experience replay mechanism is introduced into the action networks to improve the stability of the training process. Finally, the effectiveness and robustness of the proposed MARL algorithm are verified by simulation.
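
The abstract states that, once each unit's state-action value function is approximated by a quadratic, the dispatch reduces to a convex optimization problem under the coupling constraint that total generation meets total demand. The sketch below illustrates only that last step, in a centralized form, and is not the authors' distributed projection algorithm. It assumes a hypothetical quadratic surrogate cost a_i * p_i^2 + b_i * p_i for each unit with a_i > 0, plus box limits on each output; the function name `dispatch_quadratic` and all coefficient values are illustrative. The coupling multiplier is found by bisection.

```python
def dispatch_quadratic(a, b, p_min, p_max, demand, tol=1e-9):
    """Minimize sum(a_i*p_i**2 + b_i*p_i) s.t. sum(p_i) == demand and
    p_min_i <= p_i <= p_max_i, assuming every a_i > 0 (illustrative surrogate)."""
    if not (sum(p_min) <= demand <= sum(p_max)):
        raise ValueError("demand is outside the feasible generation range")

    def outputs(lam):
        # Stationarity gives p_i = (lam - b_i) / (2*a_i), clipped to the limits.
        return [min(max((lam - bi) / (2.0 * ai), lo), hi)
                for ai, bi, lo, hi in zip(a, b, p_min, p_max)]

    # Bracket the multiplier: at `low` every unit sits at its lower limit,
    # at `high` every unit sits at its upper limit.
    low = min(2.0 * ai * lo + bi for ai, bi, lo in zip(a, b, p_min))
    high = max(2.0 * ai * hi + bi for ai, bi, hi in zip(a, b, p_max))

    # Total output is non-decreasing in lam, so bisection applies.
    for _ in range(200):
        mid = 0.5 * (low + high)
        if sum(outputs(mid)) < demand:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return outputs(0.5 * (low + high))


if __name__ == "__main__":
    # Hypothetical two-unit example with made-up coefficients.
    print(dispatch_quadratic([0.1, 0.2], [2.0, 1.0], [0.0, 0.0], [100.0, 100.0], 30.0))
```

In the setting the abstract describes, the quadratic coefficients would be learned rather than given, and the balance constraint would be enforced through communication among the generation units instead of a single central bisection.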


Pages: 9553-9564

Number of pages: 12
