Design of Data Standardization Cleaning System Under Multi-source Data Access

Cited by: 1
Authors
Li, Bo [1]
Zhao, Ruifeng [2]
Chen, Fengchao [3]
Zhang, Bo [1]
Zhou, Lide [3]
He, Yipeng [3]
Lu, Chengbo [3]
Affiliations
[1] South China Univ Technol, Sch Elect Power, Guangzhou 510640, Guangdong, Peoples R China
[2] Power Dispatching & Control Ctr Guangdong Grid Co, Guangzhou 510600, Guangdong, Peoples R China
[3] Guangdong Power Grid Corp, Dongguan Power Supply Bur, Dongguan 523008, Guangdong, Peoples R China
Keywords
Multi-source data; Multi-task optimization; Data cleaning; Massive data
DOI
10.1007/978-3-030-99581-2_7
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data standardization and cleaning systems for multi-source data suffer from poor initial clustering of the data, which drives up their processing time. To address this, a data standardization and cleaning system for multi-source data access is designed. Based on the characteristics of multi-source data, a preliminary clustering module is added to preprocess the data. Data similarity is then calculated to determine whether the data to be processed needs cleaning. Data flagged for cleaning is processed with conventional data cleaning techniques, completing the multi-source data cleaning and the system design. Experimental results show that the system processes missing values faster and performs better at data screening and standardization, so its overall cleaning performance is better. The system is therefore better suited to multi-source data processing.
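The abstract does not give the clustering or similarity method, so the following is only a minimal sketch of the described flow (preliminary clustering, then a similarity check that decides which records need cleaning). Everything in it is an assumption for illustration: the helper names `preliminary_cluster`, `similarity`, and `needs_cleaning`, the grouping key, the 0.85 threshold, the sample records, and the use of Python's standard `difflib.SequenceMatcher` as the similarity measure.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def preliminary_cluster(records, key):
    """Group records by a coarse key (a stand-in for the preliminary clustering module)."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[key(rec)].append(rec)
    return dict(clusters)


def similarity(a, b):
    """Character-level similarity in [0, 1]; an assumed measure, not the paper's."""
    return SequenceMatcher(None, a, b).ratio()


def needs_cleaning(cluster, threshold=0.85):
    """Flag records that are near-duplicates of an earlier record in the same cluster."""
    flagged = []
    for i, rec in enumerate(cluster):
        if any(similarity(rec, prev) >= threshold for prev in cluster[:i]):
            flagged.append(rec)
    return flagged


# Hypothetical records arriving from different sources.
records = [
    "transformer T1 110kV",
    "transformer T1 110 kV",
    "breaker B7 10kV",
    "line L3 220kV",
]

# Cluster on the first token, then compare records only within each cluster.
clusters = preliminary_cluster(records, key=lambda r: r.split()[0])
flagged = {name: needs_cleaning(group) for name, group in clusters.items()}
```

Comparing records only inside a cluster is what keeps the similarity step cheap: the number of pairwise comparisons grows with cluster size rather than with the size of the whole data set, which is consistent with the abstract's point that better initial clustering reduces processing time.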
Pages: 59-67
Page count: 9