Anomaly detection-based undersampling for imbalanced classification problems

Authors
Park, You-Jin [1]
Brito, Paula [2, 3]
Ma, Yun-Chen [1]
Affiliations
[1] Natl Taipei Univ Technol, Dept Ind Engn & Management, Taipei City, Taiwan
[2] Univ Porto, Fac Econ, Porto, Portugal
[3] INESC TEC, LIAAD, Porto, Portugal
Keywords
Machine learning; classification; class imbalance; anomaly; undersampling; SMOTE; NOISY;
DOI
10.1080/0305215X.2024.2315501
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
In various machine learning applications, classification plays an important role in categorizing and predicting data. To improve the classification performance, it is crucial to identify and remove the anomalies. Also, class imbalance in many machine learning applications is a very common problem since most classifiers tend to be biased toward the majority class by ignoring the minority class instances. Thus, in this research, we propose a new under-sampling technique based on anomaly detection and removal to enhance the performance of imbalanced classification problems. To demonstrate the effectiveness of the proposed method, comprehensive experiments are conducted on forty imbalanced data sets and two non-parametric hypothesis tests are employed to show the statistical difference in classification performances between the proposed method and other traditional resampling methods. From the experiment, it is shown that the proposed method improves the classification performance by effectively detecting and eliminating the anomalies among true-majority or pseudo-majority class instances.
Pages: 2565-2578
Number of pages: 14
Related papers
50 in total
  • [1] An anomaly detection-based classification system
    Hou, Haiyu
    Dozier, Gerry
    2006 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-6, 2006, : 2223 - 2230
  • [2] LOCATION BAGGING-BASED UNDERSAMPLING FOR IMBALANCED CLASSIFICATION PROBLEMS
    Rong, Tongwen
    Tian, Xing
    Ng, Wing W. Y.
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION (ICWAPR), 2016, : 72 - 77
  • [3] Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems
    Ng, Wing W. Y.
    Xu, Shichao
    Zhang, Jianjun
    Tian, Xing
    Rong, Tongwen
    Kwong, Sam
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (02) : 1269 - 1279
  • [4] WEIGHTED ENSEMBLE OF DIVERSIFIED SENSITIVITY-BASED UNDERSAMPLING FOR IMBALANCED PATTERN CLASSIFICATION PROBLEMS
    Chai, Yulin
    Zhang, Jianjun
    Ng, Wing W. Y.
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2017, : 42 - 47
  • [5] Radial-Based Undersampling for imbalanced data classification
    Koziarski, Michal
    PATTERN RECOGNITION, 2020, 102
  • [6] Anomaly detection-based condition monitoring
    Kas, M.
    Wamba, F. F.
    INSIGHT, 2022, 64 (08) : 453 - 458
  • [7] Overlap-Based Undersampling for Improving Imbalanced Data Classification
    Vuttipittayamongkol, Pattaramon
    Elyan, Eyad
    Petrovski, Andrei
    Jayne, Chrisina
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2018, PT I, 2018, 11314 : 689 - 697
  • [8] Evolutionary Undersampling for Imbalanced Big Data Classification
    Triguero, I.
    Galar, M.
    Vluymans, S.
    Cornelis, C.
    Bustince, H.
    Herrera, F.
    Saeys, Y.
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 715 - 722
  • [9] Subclass-based Undersampling for Class-imbalanced Image Classification
    Lehmann, Daniel
    Ebner, Marc
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 493 - 500
  • [10] Anomaly Detection-Based Unknown Face Presentation Attack Detection
    Baweja, Yashasvi
    Oza, Poojan
    Perera, Pramuditha
    Patel, Vishal M.
    IEEE/IAPR INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS (IJCB 2020), 2020,