Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 9
Authors
Zuo, Duo [1, 2, 3, 4, 5]
Yang, Lexin [1, 2, 3, 4, 5]
Jin, Yu [1, 6]
Qi, Huan [7]
Liu, Yahui [1, 2, 3, 4, 5]
Ren, Li [1, 2, 3, 4, 5]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR;
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification (CLC) number
R-058
Abstract
Breast cancer is the most commonly diagnosed malignancy in women worldwide. Its prevalence and incidence are increasing every year; early diagnosis together with reliable detection of relapse is therefore an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. Prediction models were developed using eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score were used to evaluate model performance. The best-performing model was then selected, and feature importance was ranked by SHapley Additive exPlanations (SHAP) values. Of the eleven algorithms, AdaBoost showed the best performance in predicting breast cancer recurrence and was adopted to establish the prediction model. CA125, CEA, Fbg, and tumor diameter were the most important features in our dataset for predicting recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost model offers clinical decision support and successfully identifies breast cancer recurrence.
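The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `confusion_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AUC is assumed here to mean the area under the ROC curve, computed by pairwise ranking of recurrence and non-recurrence cases.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_pred):
    """Threshold-based metrics for binary labels (1 = recurrence, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return Metrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),   # recall on recurrence cases
        specificity=_ratio(tn, tn + fp),   # recall on non-recurrence cases
        ppv=_ratio(tp, tp + fp),           # precision
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, which is
    fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(metrics)
print("AUC:", roc_auc)
```

Comparing several classifiers, as the study does, would amount to computing these values for each model's predictions on the same held-out patients and ranking the models on them.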
Pages: 14
Related papers
50 in total
  • [1] Machine learning-based models for the prediction of breast cancer recurrence risk
    Duo Zuo
    Lexin Yang
    Yu Jin
    Huan Qi
    Yahui Liu
    Li Ren
    BMC Medical Informatics and Decision Making, 23
  • [2] Evolution of Breast Cancer Recurrence Risk Prediction: A Systematic Review of Statistical and Machine Learning-Based Models
    El Haji, Hasna
    Souadka, Amine
    Patel, Bhavik N.
    Sbihi, Nada
    Ramasamy, Gokul
    Patel, Bhavika K.
    Ghogho, Mounir
    Banerjee, Imon
    JCO CLINICAL CANCER INFORMATICS, 2023, 7 : e2300049
  • [4] Machine learning-based radiomics models for prediction of locoregional recurrence in patients with breast cancer
    Lee, Joongyo
    Yoo, Sang Kyun
    Kim, Kangpyo
    Lee, Byung Min
    Park, Vivian Youngjean
    Kim, Jin Sung
    Kim, Yong Bae
    ONCOLOGY LETTERS, 2023, 26 (04)
  • [5] An Assessment of the Predictive Performance of Current Machine Learning-Based Breast Cancer Risk Prediction Models: Systematic Review
    Gao, Ying
    Li, Shu
    Jin, Yujing
    Zhou, Lengxiao
    Sun, Shaomei
    Xu, Xiaoqian
    Li, Shuqian
    Yang, Hongxi
    Zhang, Qing
    Wang, Yaogang
    JMIR PUBLIC HEALTH AND SURVEILLANCE, 2022, 8 (12):
  • [6] Machine Learning-Based Models Enhance the Prediction of Prostate Cancer
    Chen, Sunmeng
    Jian, Tengteng
    Chi, Changliang
    Liang, Yi
    Liang, Xiao
    Yu, Ying
    Jiang, Fengming
    Lu, Ji
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [7] Machine learning-based prediction models in neurosurgery
    Habashy, Karl J.
    Arrieta, Victor A.
    Feghali, James
    NEUROSURGICAL FOCUS, 2023, 55 (03)
  • [8] Machine learning-based prediction model for distant metastasis of breast cancer
    Duan, Hao
    Zhang, Yu
    Qiu, Haoye
    Fu, Xiuhao
    Liu, Chunling
    Zang, Xiaofeng
    Xu, Anqi
    Wu, Ziyue
    Li, Xingfeng
    Zhang, Qingchen
    Zhang, Zilong
    Cui, Feifei
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 169
  • [9] Machine learning-based prediction of breast cancer growth rate in vivo
    Shristi Bhattarai
    Sergey Klimov
    Mohammed A. Aleskandarany
    Helen Burrell
    Anthony Wormall
    Andrew R. Green
    Padmashree Rida
    Ian O. Ellis
    Remus M. Osan
    Emad A. Rakha
    Ritu Aneja
    British Journal of Cancer, 2019, 121 : 497 - 504