Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database

Authors
Wu, Yafei [1, 2, 4]
Zhang, Yaheng [1, 2]
Duan, Siyu [1, 2]
Gu, Chenming [1, 2]
Wei, Chongtao [1, 2]
Fang, Ya [1, 2, 3]
Affiliations
[1] Xiamen Univ, Sch Publ Hlth, Xiangan South Rd, Xiamen 361102, Fujian, Peoples R China
[2] Key Lab Hlth Technol Assessment Fujian Prov, Xiamen, Fujian, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Fujian, Peoples R China
[4] Hong Kong Polytech Univ, Fac Hlth & Social Sci, Sch Nursing, Hong Kong, Peoples R China
Keywords
Second primary breast cancer; Machine learning; Prognostic model; SEER; RISK; MALIGNANCIES; CARCINOMA; DIAGNOSIS; MODEL
DOI
10.1016/j.cmpb.2024.108310
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background: Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, prognostic studies focusing specifically on patients with SPBC are lacking.
Methods: This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. We selected female FPC survivors diagnosed with SPBC from 12 registries (January 1998 to December 2018) to construct prognostic models. SPBC patients from another five registries (January 2010 to December 2018) served as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariable Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier).
Results: A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. The validation set comprised 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]). Sixteen features were selected for model construction through univariate and multivariable Cox regression analyses. Among the five models, the random survival forest model showed excellent performance, with a t-AUC of 0.805 (95% CI: 0.803-0.807) and an iBrier of 0.123 (95% CI: 0.122-0.124) on the test set, and a t-AUC of 0.803 (95% CI: 0.801-0.807) and an iBrier of 0.098 (95% CI: 0.096-0.103) on the validation set. Feature importance ranking of the random survival forest model identified six key predictive features: age ranked first, and the other five top-ranked features were stage, regional nodes positive, latency, radiotherapy, and surgery.
Conclusions: The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which may help in monitoring high-risk populations.
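To give a feel for the iBrier values reported in the abstract, the following is a minimal sketch of an integrated Brier score. It is not the authors' implementation. It assumes, for simplicity, that every patient's death time is observed (no censoring), whereas a real survival analysis would add inverse-probability-of-censoring weights. The function names, the toy death times, and the time grid are all hypothetical.

```python
def brier_at(t, event_times, surv_probs):
    """Brier score at time t.

    Assumption: no censoring, so each patient's status at t is known exactly.
    surv_probs[i] is the predicted probability that patient i survives past t.
    """
    total = 0.0
    for time, prob in zip(event_times, surv_probs):
        alive = 1.0 if time > t else 0.0  # observed status at time t
        total += (alive - prob) ** 2
    return total / len(event_times)


def integrated_brier(grid, event_times, predict):
    """Average the Brier score over an increasing time grid (trapezoidal rule).

    predict(i, t) returns the predicted survival probability of patient i at t.
    """
    n = len(event_times)
    scores = [
        brier_at(t, event_times, [predict(i, t) for i in range(n)])
        for t in grid
    ]
    area = 0.0
    for k in range(len(grid) - 1):
        width = grid[k + 1] - grid[k]
        area += width * (scores[k] + scores[k + 1]) / 2.0
    return area / (grid[-1] - grid[0])


# Hypothetical toy data: death times in months for five patients.
event_times = [6.0, 14.0, 30.0, 45.0, 70.0]
grid = [12.0, 24.0, 36.0, 48.0, 60.0]


def coin_flip(i, t):
    # Uninformative model: always predicts a 50% survival probability.
    return 0.5


def oracle(i, t):
    # Perfect model: knows each patient's true status at time t.
    return 1.0 if event_times[i] > t else 0.0


ibs_coin = integrated_brier(grid, event_times, coin_flip)
ibs_oracle = integrated_brier(grid, event_times, oracle)
print(ibs_coin, ibs_oracle)
```

In this toy setting the uninformative predictor scores 0.25 and the perfect predictor scores 0, so lower values are better; the reported iBrier values of 0.123 and 0.098 can be read against that scale.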
Pages: 8