Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 9
Authors
Zuo, Duo [1, 2, 3, 4, 5]
Yang, Lexin [1, 2, 3, 4, 5]
Jin, Yu [1, 6]
Qi, Huan [7]
Liu, Yahui [1, 2, 3, 4, 5]
Ren, Li [1, 2, 3, 4, 5]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification number
R-058
Abstract
Breast cancer is the most common malignancy diagnosed in women worldwide. Its prevalence and incidence are increasing every year; therefore, early diagnosis together with appropriate relapse detection is an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. Prediction models were developed using eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score were used to evaluate model performance. The best-performing algorithm was selected on the basis of these metrics, and feature importance was ranked by Shapley Additive Explanation (SHAP) values. AdaBoost outperformed the other 10 algorithms in predicting breast cancer recurrence and was adopted to build the prediction model. CA125, CEA, fibrinogen (Fbg), and tumor diameter were the most important features in our dataset for predicting recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost-based model offers clinical decision support and successfully identifies breast cancer recurrence.
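The comparison workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it uses synthetic data from scikit-learn in place of the clinical dataset, compares only three of the eleven algorithms, and ranks models by AUC alone. The names `binary_metrics` and `compare_models` are hypothetical helpers introduced for this example, and the SHAP ranking step is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split


def binary_metrics(y_true, y_pred, y_score):
    """Compute the seven metrics named in the abstract for a binary task."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return float(num) / float(den) if den else float("nan")

    return {
        "auc": float(roc_auc_score(y_true, y_score)),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall on the recurrence class
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def compare_models(random_state=0):
    """Fit a few classifiers on synthetic data and score each on a test split."""
    # Assumption: synthetic, imbalanced data stands in for the patient cohort.
    X, y = make_classification(
        n_samples=600, n_features=8, n_informative=4,
        weights=[0.8, 0.2], random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state,
    )
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(random_state=random_state),
        "AdaBoost": AdaBoostClassifier(random_state=random_state),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
        y_pred = model.predict(X_test)
        results[name] = binary_metrics(y_test, y_pred, y_score)
    return results


results = compare_models()
best = max(results, key=lambda name: results[name]["auc"])
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("Highest AUC on this synthetic split:", best)
```

Which model ranks first here depends entirely on the synthetic data and the split, so the printed result says nothing about the study's finding. In practice, cross-validation and a selection rule that weighs several metrics would usually be preferable to a single AUC comparison.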
Pages: 14