An Incorrect Data Detection Method for Big Data Cleaning of Machinery Condition Monitoring

Cited by: 99
Authors
Xu, Xuefang [1 ]
Lei, Yaguo [1 ]
Li, Zeda [1 ]
Affiliations
[1] Xi'an Jiaotong University, Key Laboratory of Education Ministry for Modern Design and Rotor-Bearing System, Xi'an 710049, Shaanxi, China
Funding
National Natural Science Foundation of China
Keywords
Condition-monitoring big data; data cleaning; data quality; incorrect data; local outlier factor (LOF); outlier detection; network
DOI
10.1109/TIE.2019.2903774
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Incorrect data degrade the quality of condition-monitoring big data, so analyses of such data are likely to yield unreliable or misleading results. To improve data quality, this paper proposes an incorrect data detection method for data cleaning based on an improved local outlier factor (LOF). First, a sliding window divides the data into segments. Each segment is treated as an object whose attributes are time-domain statistical features extracted from it, such as the mean, maximum, and peak-to-peak value. Second, a kernel-based LOF (KLOF) is calculated from these attributes to quantify how likely each segment is to be incorrect data. Third, incorrect data are detected by comparing the KLOF values against a threshold. Finally, the method is verified on simulated vibration data from a defective rolling element bearing and on three real cases: a fixed-axle gearbox, a wind turbine, and a planetary gearbox. The results demonstrate that the proposed method effectively detects both missing segments and abnormal segments, two typical kinds of incorrect data, and is therefore helpful for big data cleaning in machinery condition monitoring.
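The pipeline summarized in the abstract (windowing, per-segment features, an outlier score, a threshold) can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: it uses the standard LOF rather than the kernel-based KLOF, whose details the abstract does not give; the function names, the synthetic signal, `k = 5`, and the threshold of 2.0 are all hypothetical choices for illustration.

```python
import math

def segment_features(signal, window, step):
    """Slide a window over the signal; return (mean, maximum, peak-to-peak) per segment."""
    features = []
    for start in range(0, len(signal) - window + 1, step):
        seg = signal[start:start + window]
        features.append((sum(seg) / window, max(seg), max(seg) - min(seg)))
    return features

def lof_scores(points, k):
    """Standard local outlier factor per point (NOT the paper's kernel-based variant)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    k_dist, neighbours = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        neighbours.append([j for j in order if dist[i][j] <= kd])  # keep ties
    eps = 1e-12  # assumption: guards against zero distances between duplicate points
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(1.0 / (sum(reach) / len(reach) + eps))  # local reachability density
    return [sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
            for i in range(n)]

def detect_incorrect_segments(signal, window, step, k, threshold):
    """Return indices of segments whose LOF exceeds the (hypothetical) threshold."""
    scores = lof_scores(segment_features(signal, window, step), k)
    return [i for i, score in enumerate(scores) if score > threshold]

# Synthetic demo signal (an assumption, not data from the paper): a sinusoid
# whose amplitude grows slowly, with one missing and one abnormal segment.
window = 100
signal = [(1.0 + 0.02 * (i // window)) * math.sin(2 * math.pi * i / 50)
          for i in range(20 * window)]
signal[5 * window:6 * window] = [0.0] * window  # missing segment: samples lost
signal[12 * window + 40] += 10.0                # abnormal segment: a large spike

flagged = detect_incorrect_segments(signal, window, step=window, k=5, threshold=2.0)
print(flagged)
```

On this synthetic signal the two corrupted segments (indices 5 and 12) sit far from the cluster of normal segments in feature space, so their LOF values are well above 1 while normal segments stay close to 1. With real data the features would normally be scaled before computing distances, and the threshold would need tuning.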
Pages: 2326 - 2336
Page count: 11
Related papers
50 in total
  • [21] CleanCloud: Cleaning Big Data on Cloud
    Wang, Hongzhi
    Ding, Xiaoou
    Chen, Xiangying
    Li, Jianzhong
    Gao, Hong
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2543 - 2546
  • [22] Bivariate analysis of complex vibration data: An application to condition monitoring of rotating machinery
    Pennacchi, P.
    Vania, A.
    Bachschmid, N.
    MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2006, 20 (08) : 2340 - 2374
  • [23] Research and application of remote condition data acquisition method for construction machinery
    School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240, China
    Yi Qi Yi Biao Xue Bao, 2009, 4 : 728 - 732
  • [24] A LOF-based method for abnormal segment detection in machinery condition monitoring
    Xu, Xuefang
    Lei, Yaguo
    Zhou, Xin
    2018 PROGNOSTICS AND SYSTEM HEALTH MANAGEMENT CONFERENCE (PHM-CHONGQING 2018), 2018, : 125 - 128
  • [25] Anomaly Detection on Data Streams for Machine Condition Monitoring
    Brandt, Tobias
    Grawunder, Marco
    Appelrath, Hans-Juergen
    2016 IEEE 14TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2016, : 1282 - 1287
  • [26] Current Transformer Condition Online Monitoring Platform Based on Big Data
    Wang, Dan
    Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [27] Research on Condition Monitoring of Power Big Data Based on Rough Sets
    Yan, Yulong
    Wu, Jilai
    Wu, Shejun
    Zhang, Jian
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON MATERIALS ENGINEERING AND INFORMATION TECHNOLOGY APPLICATIONS, 2015, 28 : 760 - 765
  • [28] Layered Encryption Method for Monitoring Network User Data for Big Data Analysis
    Qiao, Yanhua
    Zhao, Lin
    Li, Jianna
    ADVANCED HYBRID INFORMATION PROCESSING, ADHIP 2019, PT I, 2019, 301 : 84 - 93
  • [29] Enhancing Data Quality by Cleaning Inconsistent Big RDF Data
    Benbernou, Salima
    Ouziri, Mourad
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 74 - 79
  • [30] Exploring and cleaning big data with random sample data blocks
    Salman Salloum
    Joshua Zhexue Huang
    Yulin He
    Journal of Big Data, 6