Multi-Agent Reinforcement Learning With Privacy Preservation for Continuous Double Auction-Based P2P Energy Trading

Cited by: 16

Authors:

Zheng, Jiehui [1]

Liang, Ze-Ting [1]

Li, Yuanzheng [2]

Li, Zhigang [1]

Wu, Qing-Hua [1]

Affiliations:

[1] South China Univ Technol, Sch Elect Power Engn, Guangzhou 510640, Peoples R China

[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Key Lab Image Informat Proc & Intelligent Control, Minist Educ China, Wuhan 430074, Peoples R China

Source:

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | 2024 / Vol. 20 / No. 04

Keywords:

Privacy; Training; Tariffs; Peer-to-peer computing; Energy management; Scalability; Power system dynamics; Continuous double auction (CDA); dynamic potential-based reward shaping; mean-field approximation; multiagent twin delayed deep deterministic policy gradient; peer-to-peer (P2P)

DOI:

10.1109/TII.2023.3348823

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

With the increasing deployment of distributed energy resources, the energy market, which aims at local generation and load profile redistribution, faces the challenge of accommodating various types of participants. To realize social welfare maximization with privacy preservation in a dynamic energy market, this article proposes a multiagent reinforcement learning (MARL) method for quotation decision optimization in a continuous double auction (CDA)-based peer-to-peer (P2P) energy market. To address the nonstationarity and privacy violation brought by the multiagent context, we utilize mean-field approximation to abstract the unauthorized local information of other agents from the public market dynamics. An abstract Q-value function is developed for each agent to infer the neighbor agents' local observations and actions through the public clearing results in the dynamic CDA market. Moreover, to avoid sparse rewards and thereby stabilize the learning process, we propose a dynamic potential-based reward shaping term in the reward. Without altering the learned optimal policies, the agents are informed of the additional energy storage state through the reward shaping term at each time instant. To validate the effectiveness and economy of the proposed method, simulation studies are conducted on a real-world dataset. Simulation results show that the proposed MARL method achieves up to 17% higher converged episodic reward and 67% lower energy bills, which indicates competitive convergence performance and significant economic benefits.
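
Illustrative note: the abstract states that the reward shaping term leaves the optimal policies unchanged. A minimal sketch of why this can hold is given below, assuming the standard dynamic potential-based form F = gamma * Phi(s', t+1) - Phi(s, t). The potential function soc_potential, its linear decay over the horizon, and all parameter values are hypothetical choices for illustration and are not taken from the article. The discounted shaping terms telescope over an episode, so their sum depends only on the first and last potentials and not on the actions taken in between.

```python
from typing import Sequence


def soc_potential(soc: float, t: int, horizon: int, scale: float = 1.0) -> float:
    """Hypothetical time-dependent potential Phi(s, t).

    Assumption: stored energy (state of charge in [0, 1]) is valued less
    as the episode approaches its horizon. Not the article's definition.
    """
    return scale * (1.0 - t / horizon) * soc


def shaping_term(phi_now: float, phi_next: float, gamma: float) -> float:
    """Dynamic potential-based shaping: F = gamma * Phi(s', t+1) - Phi(s, t)."""
    return gamma * phi_next - phi_now


def discounted_shaping_sum(socs: Sequence[float], gamma: float, horizon: int) -> float:
    """Sum of gamma**t * F_t along one trajectory of storage states."""
    total = 0.0
    for t in range(len(socs) - 1):
        phi_now = soc_potential(socs[t], t, horizon)
        phi_next = soc_potential(socs[t + 1], t + 1, horizon)
        total += gamma**t * shaping_term(phi_now, phi_next, gamma)
    return total


if __name__ == "__main__":
    gamma, horizon = 0.9, 3
    # Two different trajectories with the same initial and final storage state.
    path_a = [0.5, 0.7, 0.2, 0.9]
    path_b = [0.5, 0.1, 0.8, 0.9]
    # Telescoping: both sums equal gamma**T * Phi(s_T, T) - Phi(s_0, 0).
    print(discounted_shaping_sum(path_a, gamma, horizon))
    print(discounted_shaping_sum(path_b, gamma, horizon))
```

Because both trajectories start and end in the same storage state, the two printed sums agree up to floating-point rounding. The shaping term therefore shifts every trajectory's return by the same amount and does not change which policy is optimal, while still giving the agent a dense per-step signal.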

Pages: 6582-6590

Page count: 9

50 entries in total

[41] A Multi-Agent Reinforcement Learning Approach for Blockchain-based Electricity Trading System
Cao, Yifan
Ren, Xiaoxu
Qiu, Chao
Wang, Xiaofei
Yao, Haipeng
Yu, F. Richard
2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021
[42] P2P multi-agent data transfer and aggregation in wireless sensor networks
Shakshuki, Elhadi
Hussain, Sajid
Matin, Abdur R.
Matin, Abdul W.
2006 IEEE INTERNATIONAL CONFERENCE ON MOBILE ADHOC AND SENSOR SYSTEMS, VOLS 1 AND 2, 2006: 615-+
[43] Multi-agent system technology for P2P applications on small portable devices
Purvis, M
Garside, N
Cranefield, S
Nowostawski, M
De Oliveira, M
AGENTS AND PEER-TO-PEER COMPUTING, 2005, 3601: 153-160
[44] Deep Learning Enabled Predictive Model for P2P Energy Trading in TEM
Sekhar, Pudi
Jose, T. J. Benedict
Parvathy, Velmurugan Subbiah
Lydia, E. Laxmi
Kadry, Seifedine
Pin, Kuntha
Nam, Yunyoung
CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 71 (01): 1473-1487
[45] Deep learning enabled predictive model for p2p energy trading in tem
Sekhar, Pudi
Benedict Jose, T.J.
Parvathy, Velmurugan Subbiah
Laxmi Lydia, E.
Kadry, Seifedine
Pin, Kuntha
Nam, Yunyoung
Computers, Materials and Continua, 2022, 71 (01): 1473-1487
[46] Privacy-Preserving Authentication Mechanism for P2P Energy Trading in Smart Grid Networks
Pathak, Aditya
Al-Anbagi, Irfan
Hamilton, Howard J.
ICC 2024 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2024: 3085-3090
[47] Quantum Finance and Fuzzy Reinforcement Learning-Based Multi-agent Trading System
Chi Cheng
Bingshen Chen
Ziting Xiao
Raymond S. T. Lee
International Journal of Fuzzy Systems, 2024, 26 (7): 2224-2245
[48] The P-graph application extension in multi-period P2P energy trading
Kong, Karen Gah Hie
Lee, Alvin Guo Jian
Teng, Sin Yong
Leong, Wei Dong
Orosz, Akos
Friedler, Ferenc
Sunarso, Jaka
How, Bing Shen
RENEWABLE & SUSTAINABLE ENERGY REVIEWS, 2024, 200
[49] P2P electricity trading model for urban multi-virtual power plants based on double-layer energy blockchain
Zhou, Kaile
Xing, Hengheng
Ding, Tao
SUSTAINABLE ENERGY GRIDS & NETWORKS, 2024, 39
[50] Peer-to-peer energy trading with energy trading consistency in interconnected multi-energy microgrids: A multi-agent deep reinforcement learning approach
Cui, Yang
Xu, Yang
Wang, Yijian
Zhao, Yuting
Zhu, Han
Cheng, Dingran
INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2024, 156
