Manifold neighboring envelope sample generation mechanism for imbalanced ensemble classification

Cited by: 1
Authors
Wang, Yiwen [1]
Li, Yongming [1]
Shen, Yinghua [2]
Li, Fan [3]
Wang, Pin [1]
Affiliations
[1] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Econ & Business Adm, Chongqing 400044, Peoples R China
[3] Chongqing Jiaotong Univ, Sch Informat Sci & Engn, Chongqing 40044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced classification problems; Imbalanced ensemble classification; Correlation information; Envelope sample; Fuzzy c-means clustering; Domain adaptation; PREDICTION
DOI
10.1016/j.ins.2024.121103
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing imbalanced ensemble (IE) methods construct their sample subsets from the same dataset, so the subsets usually suffer from low quality (limited diversity and separability). To address this problem, a manifold neighboring envelope sample generation mechanism (MNESG) and an imbalanced ensemble algorithm based on the mechanism (MNESG-IE) are proposed. First, for the original balanced subsets (OBS), a manifold neighboring sample envelope projection mechanism (MNSEP) is designed to mine the local correlation information between the samples and their nearest neighbors in the subsets. Second, fuzzy c-means clustering (FCM) is used to further mine the global correlation information among similar samples in the subsets. Third, a sample distribution consistency preservation mechanism (SDCPM) is designed to enhance the consistency of the sample distribution before and after clustering. To better reduce the accumulated losses of these three steps, the steps are conducted simultaneously, thereby realizing the MNESG, which transforms the OBS into two new types of high-quality envelope sample subsets: neighboring envelope sample (NES) subsets and neighboring cluster envelope sample (NCES) subsets. Finally, base classifiers are trained on the NES and NCES subsets and then fused by a two-dimensional sparse fusion mechanism (2D-SFM). Various representative IE algorithms are compared on over thirty benchmark datasets for verification. The results show that, compared with the state-of-the-art IE algorithms, MNESG-IE achieves improvements of 17.79%, 17.90%, 23.61%, and 18.08% in terms of ACC, AUC, F-M, and G-M, respectively.
The main original contributions of the paper are: (a) proposing the MNSEP to mine local correlation information and thereby improve the quality of the subsets; (b) proposing the MNESG to generate high-quality subsets by mining local and global correlation information simultaneously; and (c) forming an IE algorithm that better solves the imbalanced classification problem.
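The abstract reports results in terms of ACC, F-M, and G-M without defining them. The sketch below is not taken from the paper; it assumes the conventional readings (accuracy, F-measure as the harmonic mean of precision and recall, and G-mean as the geometric mean of sensitivity and specificity) for a binary problem. The function name `binary_metrics` and the toy labels are hypothetical, and AUC is omitted because it requires classifier scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Return ACC, F-measure and G-mean for binary labels.

    Assumption: F-M and G-M in the abstract denote the usual
    F-measure and G-mean; the paper may define them differently.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return {"ACC": acc, "F-M": f_measure, "G-M": g_mean}


# Toy imbalanced data: 2 minority (positive) vs 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class.
always_majority = binary_metrics(y_true, [0] * 10)

# A classifier that finds one of the two minority samples,
# at the cost of one false positive.
finds_minority = binary_metrics(y_true, [1, 0, 0, 0, 0, 0, 0, 0, 0, 1])
```

On this toy data both classifiers reach an accuracy of 0.8, but the always-majority classifier has F-measure and G-mean of 0, which is why imbalanced-classification studies typically report these metrics alongside ACC.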
Pages: 28