An Experimental Analysis of Drift Detection Methods on Multi-Class Imbalanced Data Streams

Cited by: 3
Authors
Palli, Abdul Sattar [1, 2]
Jaafar, Jafreezal [1, 3]
Gomes, Heitor Murilo [4, 5]
Hashmani, Manzoor Ahmed [1, 6]
Gilal, Abdul Rehman [1]
Affiliations
[1] Univ Teknol PETRONAS, Dept Comp & Informat Sci, Tronoh 32610, Perak, Malaysia
[2] Minist Narcot Control, Anti Narcot Force, Rawalpindi 46000, Pakistan
[3] Univ Teknol PETRONAS, Ctr Res Data Sci, Tronoh 32610, Perak, Malaysia
[4] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington 6012, New Zealand
[5] Univ Waikato, AI Inst, Hamilton 3240, New Zealand
[6] Univ Teknol PETRONAS, High Performance Cloud Comp Ctr HPC3, Tronoh 32610, Perak, Malaysia
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 22
Keywords
fault detection; concept drift; drift detection; class imbalance; multi-class classification; data stream mining; ONLINE; TESTS
DOI
10.3390/app122211688
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Featured Application: Industrial sensor-based applications generate continuous, non-stationary data streams that change over time. Analyzing the performance of existing change detection methods makes it possible to select the best-performing method for an industrial environment, where it can detect faults or unusual changes early and reduce maintenance costs.

The performance of machine learning models degrades when predicting the Remaining Useful Life (RUL) of equipment or predicting faults, owing to concept drift. The problem is aggravated when the setting involves multi-class imbalanced data. Existing drift detection methods are designed to detect particular drifts in specific scenarios. For example, a drift detector designed for binary-class data may not produce satisfactory results for applications that generate multi-class data. Similarly, a method designed to detect sudden drift may struggle to detect incremental drift. This experimental study therefore investigates the performance of existing drift detection methods on multi-class imbalanced data streams with different drift types. To that end, the study simulated streams that combine various forms of concept drift with the multi-class imbalance problem and used them to test the existing drift detection methods. The findings of the current study will aid in selecting drift detection methods when developing solutions for real-time industrial applications that encounter similar issues. The results revealed that, among the compared methods, DDM (Drift Detection Method) produced the best average F1 score. The results also indicate that multi-class imbalance increases the false alarm rate for most of the drift detection methods.
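For readers unfamiliar with DDM, the following is a minimal sketch of the general idea behind it, not the implementation or experimental setup used in the study. It assumes the commonly described formulation: the detector tracks the running error rate p and its standard deviation s = sqrt(p(1 - p)/n) of a classifier, remembers the lowest p + s seen so far, and signals a warning or a drift when the current p + s exceeds that minimum by 2 or 3 standard deviations. The class name `SimpleDDM`, the helper `drift_positions`, and the synthetic error stream are hypothetical and chosen only for illustration.

```python
import math


class SimpleDDM:
    """Illustrative DDM-style detector over a stream of 0/1 prediction errors."""

    def __init__(self, min_instances=30, warning_level=2.0, drift_level=3.0):
        self.min_instances = min_instances  # observations needed before testing
        self.warning_level = warning_level  # multiplier for the warning threshold
        self.drift_level = drift_level      # multiplier for the drift threshold
        self.reset()

    def reset(self):
        self.n = 0             # number of observations since the last reset
        self.p = 0.0           # running error rate
        self.p_min = math.inf  # error rate at the lowest p + s seen so far
        self.s_min = math.inf  # standard deviation at that point

    def update(self, error):
        """Consume one error (1 = misclassified, 0 = correct); return the state."""
        self.n += 1
        self.p += (error - self.p) / self.n  # incremental mean
        s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n < self.min_instances:
            return "stable"
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if self.p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


def drift_positions(errors):
    """Return the 1-based stream positions at which a drift is signalled."""
    detector = SimpleDDM()
    return [i for i, e in enumerate(errors, start=1)
            if detector.update(e) == "drift"]


# Synthetic error stream (an assumption for illustration): 10% errors for the
# first 1000 instances, then 50% errors, mimicking a sudden concept drift.
stream = [1 if i % 10 == 0 else 0 for i in range(1, 1001)]
stream += [1 if i % 2 == 0 else 0 for i in range(1001, 1201)]
drifts = drift_positions(stream)
print("first drift signalled at instance", drifts[0])
```

Because the detector only sees whether each prediction was right or wrong, it is agnostic to the number of classes. A rare class contributes few errors to the overall rate, which is one intuition for why imbalance can distort error-rate-based detectors.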
Pages: 18