CLUSTERING-BASED SUBSET ENSEMBLE LEARNING METHOD FOR IMBALANCED DATA

Times cited: 0
Authors
Hu, Xiao-Sheng [1]
Zhang, Run-Jing [2]
Affiliations
[1] Foshan Univ, Coll Elect & Informat Engn, Foshan 528000, Peoples R China
[2] Foshan Univ, Informat & Educ Technol Ctr, Foshan 528000, Peoples R China
Keywords
Imbalanced data; Classification; Clustering; Ensemble learning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Classification of imbalanced datasets has received considerable attention in recent research. Most classification algorithms tend to assign incoming data to the majority class, which results in poor classification performance on minority class instances, although these are usually of much greater interest. In this paper we propose a clustering-based subset ensemble learning method for handling the class imbalance problem. In the proposed approach, new balanced training datasets are first produced using clustering-based under-sampling; the new training sets are then classified by applying four algorithms, Decision Tree, Naive Bayes, KNN and SVM, as the base algorithms in combined bagging. An experimental analysis is carried out over a wide range of highly imbalanced datasets. The results show that our method can improve classification performance on both rare and normal classes stably and effectively.
Pages: 35-39
Page count: 5
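
The abstract describes its first stage, clustering-based under-sampling, only in outline. The sketch below shows one possible reading of that stage, using the Python standard library alone. It rests on several assumptions that are not in the abstract: k-means as the clustering algorithm, a round-robin draw across clusters, and the hypothetical names `kmeans` and `balanced_subsets`. It is not the authors' implementation.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns a cluster index for every point (assumed clusterer)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def nearest(p):
        return min(range(k), key=lambda c: dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [nearest(p) for p in points]


def balanced_subsets(majority, minority, k=3, n_subsets=4, seed=0):
    """Build n_subsets balanced training sets of (point, label) pairs.

    Label 0 marks majority instances, label 1 marks minority instances.
    Each subset keeps every minority instance and draws an equal number of
    majority instances, cycling over the majority clusters (an assumption).
    """
    rng = random.Random(seed)
    assign = kmeans(majority, k, seed=seed)
    clusters = [[p for p, a in zip(majority, assign) if a == c] for c in range(k)]
    clusters = [c for c in clusters if c]
    quota = min(len(minority), len(majority))
    subsets = []
    for _ in range(n_subsets):
        pools = [rng.sample(c, len(c)) for c in clusters]  # shuffled copies
        picked, i = [], 0
        while len(picked) < quota:
            pool = pools[i % len(pools)]
            if pool:
                picked.append(pool.pop())
            i += 1
        subsets.append([(p, 0) for p in picked] + [(p, 1) for p in minority])
    return subsets


# Synthetic toy data: 30 majority points in three blobs, 5 minority points.
_rng = random.Random(1)
majority = [(_rng.gauss(cx, 0.3), _rng.gauss(cy, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(10)]
minority = [(_rng.gauss(2.5, 0.3), _rng.gauss(2.5, 0.3)) for _ in range(5)]
subsets = balanced_subsets(majority, minority, k=3, n_subsets=4)
```

In the second stage, each balanced subset would be passed to the base learners the abstract names (Decision Tree, Naive Bayes, KNN and SVM) and their predictions combined. That part is omitted here because the abstract does not specify the combination rule.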