Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database

Cited by: 0
Authors
Wu, Yafei [1, 2, 4]
Zhang, Yaheng [1, 2]
Duan, Siyu [1, 2]
Gu, Chenming [1, 2]
Wei, Chongtao [1, 2]
Fang, Ya [1, 2, 3]
Affiliations
[1] Xiamen Univ, Sch Publ Hlth, Xiangan South Rd, Xiamen 361102, Fujian, Peoples R China
[2] Key Lab Hlth Technol Assessment Fujian Prov, Xiamen, Fujian, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Fujian, Peoples R China
[4] Hong Kong Polytech Univ, Fac Hlth & Social Sci, Sch Nursing, Hong Kong, Peoples R China
Keywords
Second primary breast cancer; Machine learning; Prognostic model; SEER; RISK; MALIGNANCIES; CARCINOMA; DIAGNOSIS; MODEL;
DOI
10.1016/j.cmpb.2024.108310
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Background: Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, prognostic studies focusing specifically on patients with SPBC are lacking.
Methods: This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. Female FPC survivors diagnosed with SPBC in 12 registries (January 1998 to December 2018) were selected to construct prognostic models. SPBC patients from another five registries (January 2010 to December 2018) served as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariable Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier).
Results: A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. The validation set comprised 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]). Sixteen features were selected for model construction through univariate and multivariable Cox regression analyses. Among the five models, the random survival forest model showed excellent performance, with a t-AUC of 0.805 (95% CI: 0.803-0.807) and an iBrier of 0.123 (95% CI: 0.122-0.124) on the test set, and a t-AUC of 0.803 (95% CI: 0.801-0.807) and an iBrier of 0.098 (95% CI: 0.096-0.103) on the validation set. Feature importance ranking identified age as the most important predictive feature of the random survival forest model, along with five other key features: stage, regional nodes positive, latency, radiotherapy, and surgery.
Conclusions: The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which may help in monitoring high-risk populations.
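As an illustration of the iBrier metric named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes fully observed event times with no censoring, whereas analyses of registry data such as SEER normally add inverse-probability-of-censoring weights. The function names `brier_score` and `integrated_brier` and the toy data are hypothetical.

```python
from typing import Callable, Sequence


def brier_score(event_times: Sequence[float],
                surv_probs: Sequence[float],
                t: float) -> float:
    """Brier score at time t, assuming every event time is observed (no censoring).

    surv_probs[i] is the model's predicted probability that patient i survives past t.
    """
    total = 0.0
    for event_time, prob in zip(event_times, surv_probs):
        still_alive = 1.0 if event_time > t else 0.0  # observed status at time t
        total += (still_alive - prob) ** 2
    return total / len(event_times)


def integrated_brier(event_times: Sequence[float],
                     surv_fn: Callable[[int, float], float],
                     grid: Sequence[float]) -> float:
    """Average the Brier score over an increasing time grid (trapezoidal rule).

    surv_fn(i, t) returns the predicted survival probability of patient i at time t.
    """
    n = len(event_times)
    scores = [
        brier_score(event_times, [surv_fn(i, t) for i in range(n)], t)
        for t in grid
    ]
    area = 0.0
    for k in range(len(grid) - 1):
        width = grid[k + 1] - grid[k]
        area += width * (scores[k] + scores[k + 1]) / 2.0
    return area / (grid[-1] - grid[0])


# Toy data (hypothetical): three patients with event times in years.
toy_times = [2.0, 5.0, 8.0]
toy_grid = [1.0, 3.0, 6.0, 9.0]

# An uninformative model that always predicts 0.5 scores 0.25 at every time point.
uninformative = integrated_brier(toy_times, lambda i, t: 0.5, toy_grid)

# A model that predicts the observed status exactly scores 0.
perfect = integrated_brier(
    toy_times, lambda i, t: 1.0 if toy_times[i] > t else 0.0, toy_grid
)
```

Lower values indicate better calibrated and more discriminating survival predictions, which is how the iBrier values of 0.123 and 0.098 reported above should be read.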
Pages: 8