An Incorrect Data Detection Method for Big Data Cleaning of Machinery Condition Monitoring

Cited by: 99
Authors
Xu, Xuefang [1]
Lei, Yaguo [1]
Li, Zeda [1]
Affiliations
[1] Xi'an Jiaotong Univ, Key Lab Educ Minist Modern Design & Rotor Bearing Syst, Xi'an 710049, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Condition-monitoring big data; data cleaning; data quality; incorrect data; local outlier factor (LOF); outlier detection; network
DOI
10.1109/TIE.2019.2903774
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Incorrect data degrade the quality of condition-monitoring big data, so analyses of such data are likely to yield unreliable or misleading results. To improve data quality, this paper proposes an incorrect data detection method for data cleaning based on an improved local outlier factor (LOF). First, a sliding window divides the data into segments. Each segment is treated as an object whose attributes are time-domain statistical features extracted from it, such as the mean, maximum, and peak-to-peak value. Second, a kernel-based LOF (KLOF) is calculated from these attributes to evaluate the degree to which each segment is incorrect data. Third, incorrect data are detected by comparing the KLOF values with a threshold. Finally, a simulation of vibration data generated by a defective rolling element bearing and three real cases, concerning a fixed-axle gearbox, a wind turbine, and a planetary gearbox, are used to verify the effectiveness of the proposed method. The results demonstrate that the proposed method effectively detects both missing segments and abnormal segments, two typical kinds of incorrect data, and is therefore helpful for big data cleaning in machinery condition monitoring.
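The three steps named in the abstract (sliding-window segmentation, a kernel-based LOF computed on per-segment features, and thresholding) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the KLOF formula, so the sketch assumes a standard LOF evaluated on the distance induced by a Gaussian kernel, d(x, y) = sqrt(2 - 2 k(x, y)). The function names, the median/MAD feature scaling, and the parameters `k`, `gamma`, and `threshold` are all hypothetical choices made for illustration.

```python
import numpy as np


def segment_features(signal, window, step):
    """Split a 1-D signal into sliding windows; return mean, maximum, peak-to-peak."""
    starts = range(0, len(signal) - window + 1, step)
    segments = np.array([signal[s:s + window] for s in starts])
    seg_max = segments.max(axis=1)
    return np.column_stack(
        [segments.mean(axis=1), seg_max, seg_max - segments.min(axis=1)]
    )


def kernel_distances(features, gamma):
    """Pairwise distance induced by a Gaussian kernel (an assumed KLOF ingredient)."""
    # Assumption: robust scaling (median / MAD) so no single feature dominates.
    med = np.median(features, axis=0)
    mad = np.median(np.abs(features - med), axis=0)
    mad[mad == 0] = 1.0
    z = (features - med) / (1.4826 * mad)
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-gamma * sq)
    return np.sqrt(np.maximum(2.0 - 2.0 * kernel, 0.0))


def lof_from_distances(dist, k):
    """Standard local outlier factor computed from a full distance matrix."""
    d = dist.copy()
    np.fill_diagonal(d, np.inf)                    # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]              # indices of the k nearest neighbours
    knn_d = np.take_along_axis(d, nn, axis=1)      # distances to those neighbours
    k_dist = knn_d[:, -1]                          # k-distance of every point
    reach = np.maximum(knn_d, k_dist[nn])          # reachability distances
    lrd = 1.0 / np.maximum(reach.mean(axis=1), 1e-12)  # local reachability density
    return lrd[nn].mean(axis=1) / lrd


def detect_incorrect_segments(signal, window, step, k=10, gamma=1e-3, threshold=5.0):
    """Return per-segment scores and the indices whose score exceeds the threshold."""
    features = segment_features(np.asarray(signal, dtype=float), window, step)
    scores = lof_from_distances(kernel_distances(features, gamma), k)
    return scores, np.flatnonzero(scores > threshold)


if __name__ == "__main__":
    # Synthetic example: a noisy sine with one zeroed (missing) segment
    # and one amplified (abnormal) segment.
    rng = np.random.default_rng(0)
    t = np.arange(20000) / 1000.0
    sig = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)
    sig[5000:5500] = 0.0
    sig[12000:12500] *= 5.0
    scores, flagged = detect_incorrect_segments(sig, window=500, step=500)
    print("flagged segment indices:", flagged)
```

Scores near 1 indicate a segment whose local density matches that of its neighbours; larger scores indicate isolation in feature space. The threshold and kernel width here are illustrative and would need tuning on real monitoring data.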
Pages: 2326-2336
Number of pages: 11
Related papers
50 in total
  • [31] Exploring and cleaning big data with random sample data blocks
    Salloum, Salman
    Huang, Joshua Zhexue
    He, Yulin
    JOURNAL OF BIG DATA, 2019, 6 (01)
  • [32] Enhancing Recall Using Data Cleaning for Biomedical Big Data
    Deshpande, Priya
    Rasin, Alexander
    Tchoua, Roselyne
    Furst, Jacob
    Raicu, Daniela A.
    Antani, Sameer
    2020 IEEE 33RD INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS(CBMS 2020), 2020, : 265 - 270
  • [33] Wind Turbine Condition Monitoring Using SCADA Data and Data Mining Method
    Pei, Yan
    Qian, Zheng
    Tao, Siyu
    Yu, Hao
    2018 INTERNATIONAL CONFERENCE ON POWER SYSTEM TECHNOLOGY (POWERCON), 2018, : 3760 - 3764
  • [34] Research on Subway Pedestrian Detection Algorithm Based on Big Data Cleaning Technology
    Lyu, Zhuoyang
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2021, 2021
  • [35] An incremental community detection method in social big data
    Wu, Zhenyu
    Chen, Jiaying
    Zhang, Yinuo
    2018 IEEE/ACM 5TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING APPLICATIONS AND TECHNOLOGIES (BDCAT), 2018, : 136 - 141
  • [36] An effective information detection method for social big data
    Jinrong He
    Naixue Xiong
    Multimedia Tools and Applications, 2018, 77 : 11277 - 11305
  • [37] An effective information detection method for social big data
    He, Jinrong
    Xiong, Naixue
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (09) : 11277 - 11305
  • [38] Big Data Stream Anomaly Detection with Spectral Method for UWB Radar Data
    Yun, Ying
    Wang, Wei
    PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, 2015, 322 : 253 - 259
  • [39] Big Data Cleaning Algorithms in Cloud Computing
    Feng, Zhang
    Hui-Feng, Xue
    Dong-Sheng, Xu
    Yong-Heng, Zhang
    Fei, You
    INTERNATIONAL JOURNAL OF ONLINE ENGINEERING, 2013, 9 (03) : 77 - 81
  • [40] Cleanix: a Parallel Big Data Cleaning System
    Wang, Hongzhi
    Li, Mingda
    Bu, Yingyi
    Li, Jianzhong
    Gao, Hong
    Zhang, Jiacheng
    SIGMOD RECORD, 2015, 44 (04) : 35 - 40