Dynamic data cleaning method of abnormal and missing data in a distribution network based on machine learning

Authors
Mei, Yujie [1]
Li, Yong [1]
Zhou, Wangfeng [2]
Guo, Yixiu [1]
Deng, Wei [3]
Qiao, Xuebo [1]
Affiliations
[1] School of Electrical Engineering and Information, Hunan University, Changsha, 410082, China
[2] State Grid Wenzhou Power Supply Co., Ltd., Wenzhou, 325000, China
[3] State Grid Hunan Electric Power Co., Ltd. Research Institute, Changsha, 410007, China
Funding
National Natural Science Foundation of China
Keywords
Cleaning - Filling - Gaussian distribution - Interpolation - Least squares approximations - Regression analysis
DOI
10.19783/j.cnki.pspc.221000
Abstract
Traditional data cleaning in a distribution network relies on a manually set threshold for judging abnormal data and fills missing data inefficiently. This paper proposes an integrated dynamic cleaning method, based on machine learning, for abnormal and missing data in a distribution network. First, an improved dynamic identification algorithm based on the local outlier factor and a Gaussian mixture model is proposed to select the abnormal-data threshold automatically. Second, a dynamic filling algorithm for missing data is proposed, based on the random forest algorithm and the least squares regression method. It adapts the filling algorithm to the length of the missing data segment, preserving filling accuracy while reducing running time. Abnormal data identification and missing data interpolation are then combined into an integrated dynamic cleaning architecture. The method is verified on distribution network data from an area of Hunan. The results show that the proposed method detects abnormal data accurately and automatically and balances filling accuracy against filling speed for missing data, which gives it good engineering application value. © 2023 Power System Protection and Control Press. All rights reserved.
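The threshold-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional measurements, a simplified local outlier factor that uses exactly k neighbours, and a two-component Gaussian mixture fitted by EM to the logarithm of the LOF scores (the log transform is an assumption made here for numerical stability). The names `lof_scores`, `fit_gmm2`, `detect_abnormal` and the synthetic `voltages` series are hypothetical.

```python
import math

def lof_scores(values, k=5):
    """Simplified local outlier factor for 1-D data (brute force, exactly k neighbours)."""
    n = len(values)
    neighbors, k_dist = [], []
    for i in range(n):
        # Sort all other points by distance to point i and keep the k nearest.
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        neighbors.append([j for _, j in dists[:k]])
        k_dist.append(dists[k - 1][0])
    lrd = []
    for i in range(n):
        # Reachability distance to each neighbour j: max(k-distance of j, d(i, j)).
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbors[i]]
        lrd.append(1.0 / max(sum(reach) / k, 1e-12))
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm2(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture by EM."""
    n = len(xs)
    ordered = sorted(xs)
    # Start one component on the bulk (median) and one on the upper tail (maximum).
    mu = [ordered[n // 2], ordered[-1]]
    mean_all = sum(xs) / n
    var = [max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-6)] * 2
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[c] * normal_pdf(x, mu[c], var[c]) for c in range(2)]
            s = sum(p) or 1e-300
            resp.append([pc / s for pc in p])
        # M-step: update weights, means and variances.
        for c in range(2):
            nc = max(sum(r[c] for r in resp), 1e-12)
            w[c] = nc / n
            mu[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var[c] = max(sum(r[c] * (x - mu[c]) ** 2 for r, x in zip(resp, xs)) / nc, 1e-6)
    return w, mu, var

def detect_abnormal(values, k=5):
    """Flag points whose log-LOF is more likely under the high-mean mixture component."""
    scores = lof_scores(values, k)
    logs = [math.log(s) for s in scores]
    w, mu, var = fit_gmm2(logs)
    hi = 0 if mu[0] > mu[1] else 1
    flags = []
    for x in logs:
        p = [w[c] * normal_pdf(x, mu[c], var[c]) for c in range(2)]
        flags.append(p[hi] > p[1 - hi])
    return flags, scores

# Synthetic example: 100 slowly rising voltage readings plus three injected faults.
voltages = [10.0 + 0.001 * i for i in range(100)] + [10.8, 9.2, 11.5]
flags, scores = detect_abnormal(voltages)
print([i for i, f in enumerate(flags) if f])
```

No threshold is supplied by hand: the boundary between normal and abnormal scores comes from where the two fitted components cross. A mixture of two components always splits the scores, so on a series without any abnormal points this sketch would still flag some readings; a practical version would need a check for that case.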
Pages: 158-169
Related papers
50 in total
  • [1] Classification and Prediction of Network Abnormal Data Based on Machine Learning
    Ren, Bin
    Hu, Ming
    Yan, Hui
    Yu, Ping
    [J]. 2019 INTERNATIONAL CONFERENCE ON ROBOTS & INTELLIGENT SYSTEM (ICRIS 2019), 2019, : 273 - 276
  • [2] Approximate Imputation Method for Missing Data in Machine Learning
    [J]. 1600, Xi'an Jiaotong University (51):
  • [3] Analysis of Machine Learning Based Imputation of Missing Data
    Rizvi, Syed Tahir Hussain
    Latif, Muhammad Yasir
    Amin, Muhammad Saad
    Telmoudi, Achraf Jabeur
    Shah, Nasir Ali
    [J]. CYBERNETICS AND SYSTEMS, 2023,
  • [4] ExtraImpute: A Novel Machine Learning Method for Missing Data Imputation
    Alabadla, Mustafa
    Sidi, Fatimah
    Ishak, Iskandar
    Ibrahim, Hamidah
    Affendey, Lilly Suriani
    Hamdan, Hazlina
    [J]. JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2022, 13 (05) : 470 - 476
  • [5] A survey on missing data in machine learning
    Tlamelo Emmanuel
    Thabiso Maupong
    Dimane Mpoeleng
    Thabo Semong
    Banyatsang Mphago
    Oteng Tabona
    [J]. Journal of Big Data, 8
  • [6] A survey on missing data in machine learning
    Emmanuel, Tlamelo
    Maupong, Thabiso
    Mpoeleng, Dimane
    Semong, Thabo
    Mphago, Banyatsang
    Tabona, Oteng
    [J]. JOURNAL OF BIG DATA, 2021, 8 (01)
  • [7] Data fusion method for wireless sensor network based on machine learning
    Wu, Mi
    [J]. JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2023, 23 (01) : 361 - 373
  • [8] An Imputation Method for Missing Data Based on an Extreme Learning Machine Auto-Encoder
    Lu, Cheng-Bo
    Mei, Ying
    [J]. IEEE ACCESS, 2018, 6 : 52930 - 52935
  • [9] An Association Rules-Based Method for Outliers Cleaning of Measurement Data in the Distribution Network
    Kuang, Hua
    Qin, Risheng
    He, Mi
    He, Xin
    Duan, Ruimin
    Guo, Cheng
    Meng, Xian
    [J]. FRONTIERS IN ENERGY RESEARCH, 2021, 9
  • [10] Similarity detection method of abnormal data in network based on data mining
    Sun, Xiang
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (01) : 155 - 162