A novel ensemble over-sampling approach based Chebyshev inequality for imbalanced multi-label data

Authors
Ren, Weishuo [1 ,2 ]
Zheng, Yifeng [1 ,2 ]
Zhang, Wenjie [1 ,2 ]
Qing, Depeng [1 ,2 ]
Zeng, Xianlong [1 ,2 ]
Li, Guohe [3 ]
Affiliations
[1] Minnan Normal Univ, Sch Comp Sci, Zhangzhou 363000, Fujian, Peoples R China
[2] Fujian Prov Univ, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
[3] China Univ Petr, Coll Informat Sci & Engn, Beijing 102249, Peoples R China
Keywords
Multi-label classification; Imbalanced data; Over-sampling approach; Chebyshev inequality; Group optimization strategy; CLASSIFICATION;
DOI
10.1016/j.neucom.2024.128717
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the development of intelligent technology, data exhibits characteristics of multi-label and imbalanced distribution, which lead to the degradation of classification model performance. Therefore, addressing multi-label class imbalance has become a hot research topic. Nowadays, over-sampling approaches aim to generate a superset of the original dataset to deal with imbalanced data. However, traditional over-sampling methods only employ the central data point and its nearest neighbor samples to synthesize samples without considering the impact of data distribution. To address these issues, in this paper, we propose an ensemble multi-label over-sampling algorithm (MLCIO) based on Chebyshev inequality and a group optimization strategy. Firstly, to generate more representative and diverse samples, with the seed sample serving as the sphere's center, Chebyshev inequality is utilized to ensure that synthetic samples fall within its m times the standard deviation. Secondly, a group optimization ranking weighting approach is employed to obtain more reliable and stable label information. Finally, comparative experiments are conducted on 11 imbalanced datasets from various domains using different evaluation metrics. The results demonstrate that our proposal achieves better performance than other approaches.
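The sphere-based generation step described in the abstract can be pictured with a rough sketch. This is not the MLCIO algorithm itself, whose details are not given in this record. It only illustrates the general idea of keeping a synthetic point within m standard deviations of a seed, together with the Chebyshev bound 1/m^2 on the probability mass lying outside that radius. The function names, the choice of sigma (root-mean-square distance from the seed to its neighbors), and the uniform-in-ball sampling are all assumptions made for illustration.

```python
import numpy as np


def chebyshev_bound(m: float) -> float:
    """Chebyshev's inequality: P(|X - mu| >= m * sigma) <= 1 / m**2."""
    return min(1.0, 1.0 / (m * m))


def synthesize_near_seed(seed, neighbors, m=2.0, rng=None):
    """Draw one synthetic point inside a ball of radius m * sigma around `seed`.

    Assumption (not from the paper): sigma is the root-mean-square
    distance from the seed to its neighbors.
    """
    rng = np.random.default_rng() if rng is None else rng
    seed = np.asarray(seed, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)

    # Spread of the neighborhood, measured around the seed.
    dists = np.linalg.norm(neighbors - seed, axis=1)
    sigma = np.sqrt(np.mean(dists ** 2))
    radius = m * sigma

    # Random unit direction from an isotropic Gaussian.
    direction = rng.normal(size=seed.shape)
    direction /= np.linalg.norm(direction)

    # u ** (1 / d) makes the point uniform in volume inside the ball.
    r = radius * rng.random() ** (1.0 / seed.size)
    return seed + r * direction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seed = np.array([0.0, 0.0])
    neighbors = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    print(synthesize_near_seed(seed, neighbors, m=2.0, rng=rng))
    print(chebyshev_bound(2.0))
```

A larger m widens the sphere and so allows more diverse synthetic points, while the Chebyshev bound shrinks the fraction of the underlying distribution that can lie outside it. How the paper actually chooses m and assigns labels to the synthetic samples is not stated in this record.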
Pages: 16