Collective of Base Classifiers for Mining Imbalanced Data

Cited by: 0
Authors
Jedrzejowicz, Joanna [1]
Jedrzejowicz, Piotr [2]
Affiliations
[1] Univ Gdansk, Inst Informat, Fac Math Phys & Informat, PL-80308 Gdansk, Poland
[2] Gdynia Maritime Univ, Dept Informat Syst, PL-81225 Gdynia, Poland
Keywords
Imbalanced data; Oversampling; Gene expression programming
DOI
10.1007/978-3-031-08754-7_62
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Mining imbalanced datasets is a challenging problem. In this paper we address it by proposing the GEP-NB classifier, which is based on oversampling. It combines two learning methods, Gene Expression Programming and Naive Bayes, which cooperate to produce a final prediction. At the pre-processing stage, a simple mechanism generates synthetic minority class examples and balances the training set. Next, two genes, g1 and g2, are evolved using Gene Expression Programming. They differ in the procedure used to select synthetic minority class examples. If the class predicted by g1 agrees with the class predicted by g2, their decision is final. Otherwise, the final decision is taken by the Naive Bayes classifier. The approach is validated in an extensive computational experiment, in which the results produced by GEP-NB are compared with the performance of several state-of-the-art classifiers. The comparisons show that GEP-NB offers competitive performance.
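The two mechanisms named in the abstract, balancing the training set with synthetic minority examples and falling back to Naive Bayes when g1 and g2 disagree, can be illustrated with a minimal Python sketch. The abstract does not specify how synthetic examples are generated, so `oversample_minority` below uses random interpolation between pairs of minority examples (a SMOTE-like stand-in, not the authors' procedure). The names `oversample_minority`, `gep_nb_predict`, `Vector` and `Classifier` are hypothetical, and the three classifiers are assumed to be already-trained callables that return a class label.

```python
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]  # assumed: returns a class label such as 0 or 1


def oversample_minority(
    minority: List[Vector], target_size: int, rng: random.Random
) -> List[List[float]]:
    """Return enough synthetic points for the minority class to reach target_size.

    Assumption: each synthetic point is a random convex combination of two
    minority examples. This is a SMOTE-like stand-in, not the paper's mechanism.
    """
    synthetic: List[List[float]] = []
    for _ in range(max(0, target_size - len(minority))):
        a, b = rng.choice(minority), rng.choice(minority)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


def gep_nb_predict(g1: Classifier, g2: Classifier, nb: Classifier, x: Vector) -> int:
    """Combine predictions as the abstract describes.

    If g1 and g2 agree, their common label is final; otherwise the
    Naive Bayes classifier nb decides.
    """
    label1, label2 = g1(x), g2(x)
    return label1 if label1 == label2 else nb(x)
```

In this sketch `nb` is only consulted on the examples where the two evolved genes disagree, which matches the decision rule stated in the abstract. How g1 and g2 are evolved, and how each selects its synthetic examples, is outside what the abstract specifies and is not modelled here.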
Pages: 571-585
Number of pages: 15