A two-stage stacked-based heterogeneous ensemble learning for cancer survival prediction

Cited by: 8
Authors
Yan, Fangzhou [1]
Feng, Yi [2]
Affiliations
[1] Sichuan Univ, Coll Elect Engn, Chengdu 610064, Peoples R China
[2] Sichuan Univ, Business Sch, Chengdu 610064, Peoples R China
Keywords
Stacked generalization strategy; Cancer survival prediction; Feature selection; Heterogeneous ensemble learning; MODEL; ALGORITHM; CLASSIFICATION; OPTIMIZATION; DIAGNOSIS; SYSTEM;
DOI
10.1007/s40747-022-00791-w
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer survival prediction is one of the three major tasks of cancer prognosis. To improve the accuracy of cancer survival prediction, we propose a priori knowledge- and stability-based feature selection (PKSFS) method and develop a novel two-stage heterogeneous stacked ensemble learning model (BQAXR) to predict the survival status of cancer patients. Specifically, PKSFS first obtains the optimal feature subsets from high-dimensional cancer datasets to guide subsequent model construction. BQAXR then generates five high-quality heterogeneous learners, whose individual shortcomings are addressed with improved methods, and integrates them in two stages through the stacked generalization strategy based on the optimal feature subsets. To verify the merits of PKSFS and BQAXR, we collected real survival datasets for gastric cancer and skin cancer from the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute and conducted extensive numerical experiments on these two datasets from different perspectives. The accuracy and AUC of the proposed method are 0.8209 and 0.8203 on the gastric cancer dataset, and 0.8336 and 0.8214 on the skin cancer dataset. The results show that PKSFS has marked advantages over popular feature selection methods in processing high-dimensional datasets. By taking full advantage of heterogeneous high-quality learners, BQAXR is superior not only to mainstream machine learning methods but also to improved machine learning methods, which indicates that it can effectively improve the accuracy of cancer survival prediction and provide a reference for doctors making medical decisions.
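The general pattern the abstract describes (select a feature subset, then combine heterogeneous base learners through stacked generalization) can be illustrated with a minimal Python sketch. This is not the authors' PKSFS or BQAXR: the abstract does not name the five learners or the selection criteria, so the three scikit-learn classifiers, the `SelectKBest` filter, the choice of `k=10`, and the synthetic data standing in for SEER records are all assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary "survival status" data (assumption: stands in for SEER).
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: heterogeneous base learners (illustrative choices, not BQAXR's).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=7)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
]

# Stage 2: a meta-learner is trained on out-of-fold base predictions (cv=5),
# which is the core idea of stacked generalization.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)

# A simple univariate filter (assumption: a placeholder for PKSFS)
# runs before the ensemble.
model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), stack)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"accuracy={accuracy:.4f}  AUC={auc:.4f}")
```

Placing the filter inside the pipeline means feature selection is fitted on the training split only, so the held-out accuracy and AUC are not inflated by information from the test split. The numbers printed here come from synthetic data and are not comparable to the results reported in the abstract.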
Pages: 4619 - 4639
Page count: 21
Related papers
50 in total
  • [21] A two-stage prediction model for heterogeneous effects of treatments
    Chalkou, Konstantina
    Steyerberg, Ewout
    Egger, Matthias
    Manca, Andrea
    Pellegrini, Fabio
    Salanti, Georgia
    STATISTICS IN MEDICINE, 2021, 40 (20) : 4362 - 4375
  • [22] A two-stage sampling based ensemble learning method for hyperspectral image classification
    Peng, Yanbin
    Zheng, Zhijun
    Journal of Computational Information Systems, 2015, 11 (14) : 5135 - 5142
  • [23] A Two-Stage Learning Method for Response Prediction
    Chen, Kuan-Hsi
    Ting, Zih-Yun
    Shen, Jia-Ying
    Hu, Yuh-Jyh
    Liang, Tyne
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2015, : 1336 - 1341
  • [24] Learning Algorithm in Two-Stage Selective Prediction
    Ye, Weicheng
    Chen, Dangxing
    Ramazanli, Ilqar
    2022 ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING (CACML 2022), 2022, : 512 - 521
  • [25] A Novel Ensemble Machine Learning Model for Oil Production Prediction with Two-Stage Data Preprocessing
    Fan, Zhe
    Liu, Xiusen
    Wang, Zuoqian
    Liu, Pengcheng
    Wang, Yanwei
    PROCESSES, 2024, 12 (03)
  • [26] A novel deep learning ensemble model based on two-stage feature selection and intelligent optimization for water quality prediction
    Liu, Wenli
    Liu, Tianxiang
    Liu, Zihan
    Luo, Hanbin
    Pei, Hanmin
    ENVIRONMENTAL RESEARCH, 2023, 224
  • [27] HETEROGENEOUS IMAGE CHANGE DETECTION BASED ON TWO-STAGE JOINT FEATURE LEARNING
    Han, Te
    Tang, Yuqi
    Chen, Yuzeng
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 3215 - 3218
  • [28] A two-stage credit risk scoring method with stacked-generalisation ensemble learning in peer-to-peer lending
    Wang, Chongren
    Liu, Qigang
    Li, Shuping
    INTERNATIONAL JOURNAL OF EMBEDDED SYSTEMS, 2022, 15 (02) : 158 - 166
  • [29] A hybrid two-stage financial stock forecasting algorithm based on clustering and ensemble learning
    Xu, Ying
    Yan, Cuijuan
    Peng, Shaoliang
    Nojima, Yusuke
    APPLIED INTELLIGENCE, 2020, 50 (11) : 3852 - 3867
  • [30] A Two-Stage Prediction Model in Breast Cancer Using Machine Learning Methods
    Wang, Jishuai
    Gu, De
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 124 : 43 - 44