Manifold neighboring envelope sample generation mechanism for imbalanced ensemble classification

Cited by: 1
Authors
Wang, Yiwen [1 ]
Li, Yongming [1 ]
Shen, Yinghua [2 ]
Li, Fan [3 ]
Wang, Pin [1 ]
Affiliations
[1] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Econ & Business Adm, Chongqing 400044, Peoples R China
[3] Chongqing Jiaotong Univ, Sch Informat Sci & Engn, Chongqing 40044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced classification problems; Imbalanced ensemble classification; Correlation information; Envelope sample; Fuzzy c-means clustering; Domain adaptation; PREDICTION
DOI
10.1016/j.ins.2024.121103
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing imbalanced ensemble (IE) methods construct their sample subsets from the same dataset, so the subsets usually suffer from low quality in terms of diversity and separability. To address this problem, a manifold neighboring envelope sample generation mechanism (MNESG) and an imbalanced ensemble algorithm based on it (MNESG-IE) are proposed. First, for the original balanced subsets (OBS), a manifold neighboring sample envelope projection mechanism (MNSEP) is designed to mine the local correlation information between the samples and their nearest neighbors in the subsets. Second, fuzzy c-means clustering (FCM) is used to further mine the global correlation information among similar samples in the subsets. Third, a sample distribution consistency preservation mechanism (SDCPM) is designed to enhance the consistency of the sample distribution before and after clustering. To better reduce the accumulated losses of these three steps, the steps are conducted simultaneously, thereby realizing the MNESG, which transforms the OBS into two new types of high-quality envelope sample subsets: neighboring envelope sample (NES) subsets and neighboring cluster envelope sample (NCES) subsets. Finally, base classifiers are trained on the NES and NCES subsets and then fused by a two-dimensional sparse fusion mechanism (2D-SFM). For verification, MNESG-IE is compared with various representative IE algorithms on more than thirty benchmark datasets. The results show that, compared with the state-of-the-art IE algorithms, MNESG-IE achieves improvements of 17.79%, 17.90%, 23.61%, and 18.08% in terms of ACC, AUC, F-M, and G-M, respectively.
The main contributions of the paper are: (a) proposing the MNSEP to mine the local correlation information for improving the quality of the subsets; (b) proposing the MNESG to generate high-quality subsets by mining local and global correlation information simultaneously; and (c) forming an IE algorithm to better solve the imbalanced classification problem.
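As a rough illustration of two ingredients the abstract names but does not specify, the Python sketch below builds balanced subsets and computes ACC, F-M, and G-M. It is not the authors' implementation. It assumes that a balanced subset keeps every minority sample plus an equal-size random draw of majority samples, that F-M is the F-measure, and that G-M is the geometric mean of sensitivity and specificity. The function names `make_balanced_subsets` and `imbalance_metrics` are hypothetical, and the envelope projection, FCM, and fusion steps are not covered.

```python
import math
import random


def make_balanced_subsets(labels, n_subsets, minority_label=1, seed=0):
    """Return index lists, one per subset (assumed random undersampling).

    Each subset keeps all minority indices and adds an equal number of
    majority indices drawn without replacement. Assumes the majority
    class has at least as many samples as the minority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + drawn))
    return subsets


def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute ACC, F-M, and G-M for a binary problem (assumed definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "ACC": (tp + tn) / len(pairs),
        "F-M": f_measure,
        "G-M": math.sqrt(recall * specificity),
    }


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 0, 0, 0]          # toy data, 2 minority samples
    print(make_balanced_subsets(labels, n_subsets=3))
    print(imbalance_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In such a setup, each subset would train one base classifier. G-M is typically reported alongside ACC because accuracy alone can look high when the minority class is ignored.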
Pages: 28