CLUSTERING-BASED SUBSET ENSEMBLE LEARNING METHOD FOR IMBALANCED DATA

Cited by: 0
Authors
Hu, Xiao-Sheng [1]
Zhang, Run-Jing [2]
Affiliations
[1] Foshan Univ, Coll Elect & Informat Engn, Foshan 528000, Peoples R China
[2] Foshan Univ, Informat & Educ Technol Ctr, Foshan 528000, Peoples R China
Keywords
Imbalanced data; Classification; Clustering; Ensemble learning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification of imbalanced datasets has received considerable attention in recent research. Most classification algorithms tend to assign most incoming data to the majority class, which results in poor classification performance on minority-class instances, even though these are usually of much greater interest. In this paper, we propose a clustering-based subset ensemble learning method for handling the class imbalance problem. In the proposed approach, new balanced training datasets are first produced using clustering-based under-sampling; the new training sets are then classified by applying four algorithms (Decision Tree, Naive Bayes, KNN, and SVM) as the base algorithms in combined bagging. An experimental analysis is carried out over a wide range of highly imbalanced datasets. The results show that our method can stably and effectively improve classification performance on both rare and normal classes.
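The abstract gives only the outline of the pipeline, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes k-means as the clustering step, sampling from each majority cluster in proportion to its size, and majority voting as the combination rule. Two simple stand-in base learners (1-nearest-neighbour and nearest class centroid) replace the four algorithms named in the abstract to keep the example dependency-free. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, rng, iters=20):
    """Plain k-means (assumed clustering step); returns the non-empty clusters."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]

def balanced_subsets(majority, minority, k, n_subsets, rng):
    """Build training sets with roughly as many majority as minority points."""
    clusters = kmeans(majority, k, rng)
    subsets = []
    for _ in range(n_subsets):
        picked = []
        for c in clusters:
            # Assumption: each cluster contributes in proportion to its size,
            # with at least one point per cluster.
            quota = max(1, round(len(minority) * len(c) / len(majority)))
            picked += rng.sample(c, min(quota, len(c)))
        # Label 0 = majority class, label 1 = minority class.
        subsets.append([(p, 0) for p in picked] + [(p, 1) for p in minority])
    return subsets

def fit_1nn(train):
    """Stand-in base learner: 1-nearest-neighbour."""
    return lambda x: min(train, key=lambda t: dist2(x, t[0]))[1]

def fit_centroid(train):
    """Stand-in base learner: nearest class centroid."""
    cents = {y: mean([p for p, lab in train if lab == y]) for y in (0, 1)}
    return lambda x: min(cents, key=lambda y: dist2(x, cents[y]))

def fit_ensemble(subsets, learners):
    """Train every learner on every subset; combine by majority vote."""
    models = [fit(s) for s in subsets for fit in learners]
    def predict(x):
        # Ties go to the label that was voted for first.
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict

# Toy data: 60 majority points near (0, 0), 6 minority points near (6, 6).
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
minority = [(rng.gauss(6, 0.5), rng.gauss(6, 0.5)) for _ in range(6)]

subsets = balanced_subsets(majority, minority, k=3, n_subsets=3, rng=rng)
predict = fit_ensemble(subsets, [fit_1nn, fit_centroid])
```

Calling `predict((6.0, 6.0))` queries the ensemble for a point near the minority cluster. Swapping in real Decision Tree, Naive Bayes, KNN, and SVM implementations would only require passing different fitting functions to `fit_ensemble`.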
Pages: 35-39
Number of pages: 5