Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database

Authors
Wu, Yafei [1,2,4]
Zhang, Yaheng [1,2]
Duan, Siyu [1,2]
Gu, Chenming [1,2]
Wei, Chongtao [1,2]
Fang, Ya [1,2,3]
Affiliations
[1] Xiamen Univ, Sch Publ Hlth, Xiangan South Rd, Xiamen 361102, Fujian, Peoples R China
[2] Key Lab Hlth Technol Assessment Fujian Prov, Xiamen, Fujian, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Fujian, Peoples R China
[4] Hong Kong Polytech Univ, Fac Hlth & Social Sci, Sch Nursing, Hong Kong, Peoples R China
Keywords
Second primary breast cancer; Machine learning; Prognostic model; SEER; RISK; MALIGNANCIES; CARCINOMA; DIAGNOSIS; MODEL;
DOI
10.1016/j.cmpb.2024.108310
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
Background: Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, prognostic studies focusing specifically on patients with SPBC are lacking. Methods: This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. We selected female FPC survivors diagnosed with SPBC from 12 registries (January 1998 to December 2018) to construct prognostic models. SPBC patients from another five registries (January 2010 to December 2018) served as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariable Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier). Results: A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. In the validation set, a total of 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]) were enrolled. Sixteen features were selected for model construction through univariate and multivariable Cox regression analyses. Among the five models, the random survival forest model showed excellent performance, with a t-AUC of 0.805 (95% CI: 0.803-0.807) and an iBrier of 0.123 (95% CI: 0.122-0.124) on the test set, and a t-AUC of 0.803 (95% CI: 0.801-0.807) and an iBrier of 0.098 (95% CI: 0.096-0.103) on the validation set. Feature importance ranking identified the six key predictive features of the random survival forest model: age (ranked first), stage, regional nodes positive, latency, radiotherapy, and surgery. Conclusions: The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which is helpful for monitoring high-risk populations.
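As a rough illustration of the iBrier metric named in the Methods, the sketch below averages the Brier score over a time grid. It is not the authors' implementation: it assumes every event time is fully observed (no censoring), whereas a real SEER analysis would need censoring-aware weights. The function names, the toy event times, and the time grid are all hypothetical.

```python
from typing import Callable, Sequence

SurvivalFn = Callable[[float], float]  # maps time t to predicted P(survival beyond t)


def brier_score_at(t: float,
                   event_times: Sequence[float],
                   surv_fns: Sequence[SurvivalFn]) -> float:
    """Brier score at time t, ASSUMING no censoring (every event time is observed)."""
    total = 0.0
    for time_i, surv_i in zip(event_times, surv_fns):
        still_alive = 1.0 if time_i > t else 0.0  # observed status at time t
        total += (still_alive - surv_i(t)) ** 2   # squared prediction error
    return total / len(event_times)


def integrated_brier_score(grid: Sequence[float],
                           event_times: Sequence[float],
                           surv_fns: Sequence[SurvivalFn]) -> float:
    """Trapezoidal average of the Brier score over an increasing time grid."""
    scores = [brier_score_at(t, event_times, surv_fns) for t in grid]
    area = 0.0
    for k in range(1, len(grid)):
        area += 0.5 * (scores[k] + scores[k - 1]) * (grid[k] - grid[k - 1])
    return area / (grid[-1] - grid[0])


# Toy data (hypothetical): three patients with observed event times in years.
event_times = [2.0, 5.0, 8.0]
grid = [1.0, 3.0, 6.0, 9.0]

# A perfect predictor knows each event time; an uninformative one always says 0.5.
perfect = [lambda t, T=T: 1.0 if t < T else 0.0 for T in event_times]
coin_flip = [lambda t: 0.5 for _ in event_times]

print(integrated_brier_score(grid, event_times, perfect))    # 0.0
print(integrated_brier_score(grid, event_times, coin_flip))  # 0.25
```

Lower values are better: 0 is a perfect prediction, and 0.25 is what a constant 0.5 survival probability scores, which gives a reference point for the reported iBrier values of 0.123 and 0.098.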
Pages: 8