SWSEL: Sliding Window-based Selective Ensemble Learning for class-imbalance problems

Cited by: 7
Authors
Dai, Qi [1]
Liu, Jian-wei [1, 3]
Yang, Jia-Peng [2]
Affiliations
[1] China Univ Petr, Coll Informat Sci & Engn, Dept Automat, Beijing, Peoples R China
[2] North China Univ Sci & Technol, Coll Sci, Tangshan, Peoples R China
[3] China Univ Petr, 260 Mailbox, Beijing 102249, Peoples R China
Keywords
Sliding window; Selective ensemble learning; Ensemble learning; Distance metric; Imbalanced data; NETWORKS; ALGORITHMS; DIVERSITY; ACCURACY
DOI
10.1016/j.engappai.2023.105959
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
For class-imbalance problems, traditional supervised learning algorithms tend to favor majority instances (also called negative instances) and therefore struggle to identify minority instances (also called positive instances) accurately. Ensemble learning is a common approach to the class-imbalance problem: it builds multiple classifiers on the training dataset to improve the recognition accuracy of minority instances. Sliding windows are commonly used for processing data streams, but few researchers have used them to select majority instances and construct ensemble learning models. Traditional ensemble learning methods use some or all of the majority instances for modeling through oversampling or undersampling, and so inherit the drawbacks of those preprocessing methods. In this paper, we therefore use similarity mapping to construct pseudo-sequences of majority instances. Following the sliding-window idea, we make full use of all existing majority instances and propose a novel sliding window-based selective ensemble learning method (SWSEL) for the class-imbalance problem. The method borrows the idea of distance alignment from multi-view alignment to align the centers of the minority instances with the majority instances, and then slides a window along the pseudo-sequence of majority instances to select them. In addition, to keep a large number of classifiers from causing long running times, we use a distance metric to select a certain number of base classifiers for the final ensemble model. Extensive experiments on various real-world datasets show that, with SVM, MLP, and RF as base classifiers, SWSEL achieves statistically significant improvements on two evaluation metrics, AUC and G-mean, over state-of-the-art methods.
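The sliding-window selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity mapping, the window size, or the stride, so the sketch assumes Euclidean distance to the minority-class center as the ordering criterion, a window as large as the minority class, and a non-overlapping stride by default. The function name `sliding_window_subsets` and its parameters are hypothetical.

```python
import numpy as np


def sliding_window_subsets(X_maj, X_min, step=None):
    """Build balanced training subsets by sliding a window over majority
    instances ordered by their distance to the minority-class center.

    Assumptions (not taken from the paper): Euclidean distance defines the
    pseudo-sequence, and the window size equals the number of minority
    instances.
    """
    X_maj = np.asarray(X_maj, dtype=float)
    X_min = np.asarray(X_min, dtype=float)
    window = len(X_min)
    n_maj = len(X_maj)
    if window == 0 or n_maj < window:
        raise ValueError("need at least as many majority as minority instances")
    step = window if step is None else step  # assumed default: no overlap

    # Order majority instances into a pseudo-sequence, nearest first.
    center = X_min.mean(axis=0)
    dist = np.linalg.norm(X_maj - center, axis=1)
    order = np.argsort(dist, kind="stable")

    # Window start positions; add a final window flush with the end so that
    # every majority instance is used at least once.
    starts = list(range(0, n_maj - window + 1, step))
    if starts[-1] != n_maj - window:
        starts.append(n_maj - window)

    subsets = []
    for start in starts:
        idx = order[start:start + window]
        X = np.vstack([X_maj[idx], X_min])
        # Label convention from the abstract: majority = negative (0),
        # minority = positive (1).
        y = np.concatenate([np.zeros(window, dtype=int),
                            np.ones(window, dtype=int)])
        subsets.append((idx, X, y))
    return subsets
```

Each returned subset could then be used to train one base classifier (for example an SVM, MLP, or RF, as in the abstract). The selection of a reduced set of base classifiers by distance metric is not sketched here, because the abstract gives no detail about it.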
Pages: 20
Related papers
50 in total
  • [21] Distributed Sparse Class-Imbalance Learning and Its Applications
    Maurya, Chandresh Kumar
    Toshniwal, Durga
    Venkoparao, Gopalan Vijendran
    IEEE TRANSACTIONS ON BIG DATA, 2021, 7 (05) : 832 - 844
  • [22] Exploratory under-sampling for class-imbalance learning
    Liu, Xu-Ying
    Wu, Jianxin
    Zhou, Zhi-Hua
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 965 - 969
  • [24] AWSMOTE: An SVM-Based Adaptive Weighted SMOTE for Class-Imbalance Learning
    Wang, Jia-Bao
    Zou, Chun-An
    Fu, Guang-Hui
    Scientific Programming, 2021, 2021
  • [26] On Chance Performance in High-Dimensional Class-Imbalance Problems
    Udu, Amadi Gabriel
    Lecchini-Visintini, Andrea
    Dong, Hongbiao
    2024 UKACC 14TH INTERNATIONAL CONFERENCE ON CONTROL, CONTROL, 2024, : 254 - 255
  • [27] Distance mapping overlap complexity metric for class-imbalance problems
    Dai, Qi
    Liu, Jian-wei
    Shi, Yong-hui
    APPLIED SOFT COMPUTING, 2024, 163
  • [28] Class Imbalance Ensemble Learning Based on the Margin Theory
    Feng, Wei
    Huang, Wenjiang
    Ren, Jinchang
    APPLIED SCIENCES-BASEL, 2018, 8 (05)
  • [29] Towards Mitigating the Class-Imbalance Problem for Partial Label Learning
    Wang, Jing
    Zhang, Min-Ling
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 2427 - 2436
  • [30] Large-Scale Distributed Sparse Class-Imbalance Learning
    Maurya, Chandresh Kumar
    Toshniwal, Durga
    INFORMATION SCIENCES, 2018, 456 : 1 - 12