A two-stage stacked-based heterogeneous ensemble learning for cancer survival prediction

Cited by: 8
Authors
Yan, Fangzhou [1]
Feng, Yi [2]
Affiliations
[1] Sichuan Univ, Coll Elect Engn, Chengdu 610064, Peoples R China
[2] Sichuan Univ, Business Sch, Chengdu 610064, Peoples R China
Keywords
Stacked generalization strategy; Cancer survival prediction; Feature selection; Heterogeneous ensemble learning; MODEL; ALGORITHM; CLASSIFICATION; OPTIMIZATION; DIAGNOSIS; SYSTEM;
DOI
10.1007/s40747-022-00791-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer survival prediction is one of the three major tasks of cancer prognosis. To improve the accuracy of cancer survival prediction, we propose a priori knowledge- and stability-based feature selection (PKSFS) method and develop a novel two-stage heterogeneous stacked ensemble learning model (BQAXR) to predict the survival status of cancer patients. Specifically, PKSFS first obtains the optimal feature subsets from the high-dimensional cancer datasets to guide the subsequent model construction. Then, BQAXR generates five high-quality heterogeneous learners, whose individual shortcomings are addressed with improved methods, and integrates them in two stages through the stacked generalization strategy based on the optimal feature subsets. To verify the merits of PKSFS and BQAXR, we collected real survival datasets for gastric cancer and skin cancer from the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute, and conducted extensive numerical experiments from different perspectives on these two datasets. The accuracy and AUC of the proposed method are 0.8209 and 0.8203 on the gastric cancer dataset, and 0.8336 and 0.8214 on the skin cancer dataset. The results show that PKSFS has marked advantages over popular feature selection methods in processing high-dimensional datasets. By taking full advantage of heterogeneous high-quality learners, BQAXR is not only superior to mainstream machine learning methods but also outperforms improved machine learning methods, which indicates that it can effectively improve the accuracy of cancer survival prediction and provide a reference for doctors making medical decisions.
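The stacked generalization strategy mentioned in the abstract can be illustrated with a small, generic sketch. The abstract does not name the five base learners, the improvements applied to them, or the details of the two integration stages, so everything below is an assumption made for illustration: the synthetic data, the three toy base learners (`Centroid`, `KNN`, `Logistic`), the logistic meta-learner, and the helper names `stack_fit` and `stack_predict` are all hypothetical. It shows only the general idea of training a meta-learner on out-of-fold predictions from heterogeneous learners, not BQAXR itself.

```python
import math
import random

def make_data(n, seed):
    """Synthetic binary data (illustrative only, not SEER data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted by +1.5 on both features.
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(1.5 * label, 1.0)])
        y.append(label)
    return X, y

class Centroid:
    """Toy base learner: score from relative distance to class centroids."""
    def fit(self, X, y):
        self.c = []
        for k in (0, 1):
            rows = [x for x, t in zip(X, y) if t == k]
            self.c.append([sum(col) / len(rows) for col in zip(*rows)])
        return self
    def predict_proba(self, X):
        out = []
        for x in X:
            d0, d1 = math.dist(x, self.c[0]), math.dist(x, self.c[1])
            out.append(d0 / (d0 + d1 + 1e-12))  # far from class 0 -> high score
        return out

class KNN:
    """Toy base learner: fraction of positives among the k nearest neighbours."""
    def __init__(self, k=7):
        self.k = k
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict_proba(self, X):
        out = []
        for x in X:
            near = sorted(zip(self.X, self.y), key=lambda p: math.dist(x, p[0]))
            out.append(sum(t for _, t in near[:self.k]) / self.k)
        return out

class Logistic:
    """Logistic regression by batch gradient descent (base and meta learner)."""
    def __init__(self, lr=0.5, epochs=500):
        self.lr, self.epochs = lr, epochs
    def _p(self, x):
        z = self.b + sum(w * v for w, v in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))
    def fit(self, X, y):
        self.w, self.b = [0.0] * len(X[0]), 0.0
        n = len(X)
        for _ in range(self.epochs):
            gw, gb = [0.0] * len(self.w), 0.0
            for x, t in zip(X, y):
                err = self._p(x) - t
                gb += err
                for j, v in enumerate(x):
                    gw[j] += err * v
            self.w = [w - self.lr * g / n for w, g in zip(self.w, gw)]
            self.b -= self.lr * gb / n
        return self
    def predict_proba(self, X):
        return [self._p(x) for x in X]

def stack_fit(bases, meta, X, y, n_folds=5):
    """Fit the meta-learner on out-of-fold predictions of the base learners."""
    n = len(X)
    oof = [[0.0] * len(bases) for _ in range(n)]
    for f in range(n_folds):
        val = [i for i in range(n) if i % n_folds == f]
        trn = [i for i in range(n) if i % n_folds != f]
        for j, make in enumerate(bases):
            model = make().fit([X[i] for i in trn], [y[i] for i in trn])
            for i, p in zip(val, model.predict_proba([X[i] for i in val])):
                oof[i][j] = p
    meta.fit(oof, y)
    # Refit every base learner on all training data for use at prediction time.
    return [make().fit(X, y) for make in bases], meta

def stack_predict(fitted, meta, X):
    cols = [m.predict_proba(X) for m in fitted]
    return meta.predict_proba([list(row) for row in zip(*cols)])

X_train, y_train = make_data(200, seed=1)
X_test, y_test = make_data(100, seed=2)
fitted, meta = stack_fit([Centroid, KNN, Logistic], Logistic(), X_train, y_train)
probs = stack_predict(fitted, meta, X_test)
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y_test)) / len(y_test)
print(f"stacked accuracy on synthetic data: {accuracy:.3f}")
```

Using out-of-fold predictions keeps the meta-learner from seeing base-learner outputs on samples those learners were trained on, which is the usual motivation for stacking over simple averaging. In practice a maintained implementation, such as scikit-learn's `StackingClassifier`, would normally replace these hand-written classes.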
Pages: 4619-4639
Number of pages: 21