A Big Data Online Cleaning Algorithm Based on Dynamic Outlier Detection

Cited by: 8
Authors
Diao, Yinglong [1]
Liu, Ke-yan [1]
Meng, Xiaoli [1]
Ye, Xueshun [1]
He, Kaiyuan [1]
Affiliation
[1] China Electric Power Research Institute, Power Distribution Department, Beijing, People's Republic of China
Keywords
component; online cleaning; deviation detection; dynamic outlier detection; big data;
DOI
10.1109/CyberC.2015.68
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
A big data online cleaning algorithm based on dynamic outlier detection is proposed. Its goals are to clean large-scale, mixed and inaccurate monitoring or collected data effectively, to reduce the cost of data caching, and to keep deviation detection consistent across the time-series data of each cycle. The data cleaning method is improved by density-based local outlier detection, by uniformly sampling clusters to dilute the Euclidean distance matrix, and by carrying some corrections over into the next cleaning cycle. This prevents a single sampling from biasing the overall cleaning and reduces the amount of computation once cleaning has stabilized, which greatly increases speed. Finally, a distributed solution for the online cleaning algorithm on the Hadoop platform is presented.
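The abstract names density-based local outlier detection over Euclidean distances but gives no formulas. As a rough illustration only, the sketch below computes a simplified Local Outlier Factor (LOF) score in plain Python. It is not the authors' algorithm: the function name `lof_scores`, the parameter `k`, and the sample points are all assumptions made for this example. The sampling, correction carry-over, and Hadoop parts of the paper are not modeled.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lof_scores(points, k=3):
    """Simplified LOF: uses exactly k neighbours per point (ties are cut off).

    Hypothetical helper for illustration; scores near 1 suggest an inlier,
    scores well above 1 suggest a local outlier.
    """
    n = len(points)
    # Full pairwise Euclidean distance matrix.
    dist = [[euclidean(p, q) for q in points] for p in points]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        mean_reach = sum(reach) / k
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)

    # LOF: mean density of the neighbours relative to the point's own density.
    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):
            scores.append(1.0)  # duplicate points: treat as inliers
        else:
            scores.append(sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]))
    return scores


# Assumed sample data: a tight cluster plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = lof_scores(points, k=3)
flagged = [p for p, s in zip(points, scores) if s > 1.5]  # threshold is an assumption
print(flagged)
```

The full distance matrix costs O(n²) time and memory, which is presumably why the abstract mentions sampling to dilute it. A production version would compute distances on a sample or use a spatial index instead.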
Pages: 230-234
Page count: 5