Machine Learning Improves Prediction Over Logistic Regression on Resected Colon Cancer Patients

Cited by: 19
Authors
Leonard, Grey [1 ]
South, Charles [2 ]
Balentine, Courtney [1 ,3 ,4 ]
Porembka, Matthew [1 ]
Mansour, John [1 ]
Wang, Sam [1 ]
Yopp, Adam [1 ]
Polanco, Patricio [1 ]
Zeh, Herbert [1 ]
Augustine, Mathew [1 ,3 ]
Affiliations
[1] Univ Texas Southwestern Med Ctr Dallas, Dept Surg, Dallas, TX 75390 USA
[2] Southern Methodist Univ, Dept Stat Sci, Dallas, TX USA
[3] VA North Texas Healthcare Syst, Dallas, TX USA
[4] UTSW Surg Ctr Outcomes Implementat & Novel Interv, Dallas, TX USA
Keywords
Colon cancer; Prediction; Machine learning; Outcomes; Risk; READMISSION; COMPLICATIONS; MODEL; RISK; MORTALITY; COLECTOMY; ADULTS; COST;
DOI
10.1016/j.jss.2022.01.012
Chinese Library Classification
R61 [Surgical Operations]
Abstract
Introduction: Despite advances, readmission and mortality rates for surgical patients with colon cancer remain high. Prediction models using regression techniques allow for risk stratification to aid periprocedural care. Technological advances have enabled large datasets to be analyzed using machine learning (ML) algorithms. A national database of colon cancer patients was selected to determine whether ML methods predict outcomes following surgery better than conventional methods.
Methods: Surgical colon cancer patients were identified using the 2013 National Cancer Database (NCDB). The negative outcome was defined as a composite of 30-d unplanned readmission and 30- and 90-d mortality. ML models, including Random Forest and XGBoost, were built and compared with conventional logistic regression. To account for unbalanced outcomes, a synthetic minority oversampling technique (SMOTE) was implemented and applied using XGBoost.
Results: Analysis included 528,060 patients. The negative outcome occurred in 11.6% of patients. Model building utilized 30 variables. The primary metric for model comparison was area under the curve (AUC). In comparison to logistic regression (AUC 0.730, 95% CI: 0.725-0.735), AUCs for ML algorithms ranged between 0.748 and 0.757, with the Random Forest model (AUC 0.757, 95% CI: 0.752-0.762) outperforming XGBoost (AUC 0.756, 95% CI: 0.751-0.761) and XGBoost using SMOTE data (AUC 0.748, 95% CI: 0.743-0.753).
Conclusions: We show that a large registry of surgical colon cancer patients can be utilized to build ML models that improve outcome prediction, with differential discriminative ability across models. These results reveal the potential of these methods to enhance risk prediction, leading to improved strategies to mitigate those risks. (c) 2022 Elsevier Inc. All rights reserved.
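The abstract compares models by AUC with a 95% confidence interval but does not say how the intervals were computed. The following is a minimal sketch of one common approach, a percentile bootstrap, written in plain Python. It is not the authors' method: the function names `auc` and `bootstrap_ci`, the synthetic labels and scores, and the bootstrap settings are all assumptions made for illustration. Only the event rate of roughly 11.6% is taken from the abstract.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic data only: about 11.6% events, with scores shifted upward
# for the event class so that the toy "model" has some discrimination.
rng = random.Random(42)
labels = [1 if rng.random() < 0.116 else 0 for _ in range(300)]
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC {point:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

On a cohort of 528,060 patients the pairwise loop in `auc` would be too slow; a rank-based computation or a library routine would be the practical choice. The sketch is meant only to show what the reported intervals express.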
Pages: 181-193
Page count: 13
Related papers
50 in total
  • [41] Machine Learning and Risk Assessment: Random Forest Does Not Outperform Logistic Regression in the Prediction of Sexual Recidivism
    Etzler, Sonja
    Schonbrodt, Felix D.
    Pargent, Florian
    Eher, Reinhard
    Rettenberger, Martin
    ASSESSMENT, 2024, 31 (02) : 460 - 481
  • [42] Comparing logistic regression and machine learning for obesity risk prediction: A systematic review and meta-analysis
    Boakye, Nancy Fosua
    O'Toole, Ciaran Courtney
    Jalali, Amirhossein
    Hannigan, Ailish
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2025, 199
  • [43] Optimizing thyroid AUS nodules malignancy prediction: a comprehensive study of logistic regression and machine learning models
    Cao, Yuan
    Yang, Yixian
    Chen, Yunchao
    Luan, Mengqi
    Hu, Yan
    Zhang, Lu
    Zhan, Weiwei
    Zhou, Wei
    FRONTIERS IN ENDOCRINOLOGY, 2024, 15
  • [44] Comparison of logistic regression with machine learning methods for the prediction of fetal growth abnormalities: a retrospective cohort study
    Kuhle, Stefan
    Maguire, Bryan
    Zhang, Hongqun
    Hamilton, David
    Allen, Alexander C.
    Joseph, K. S.
    Allen, Victoria M.
    BMC PREGNANCY AND CHILDBIRTH, 2018, 18
  • [46] Risk prediction for malignant intraductal papillary mucinous neoplasm of the pancreas: logistic regression versus machine learning
    Kang, Jae Seung
    Lee, Chanhee
    Song, Wookyeong
    Choo, Wonho
    Lee, Seungyeoun
    Lee, Sungyoung
    Han, Youngmin
    Bassi, Claudio
    Salvia, Roberto
    Marchegiani, Giovanni
    Wolfgang, Cristopher L.
    He, Jin
    Blair, Alex B.
    Kluger, Michael D.
    Su, Gloria H.
    Kim, Song Cheol
    Song, Ki-Byung
    Yamamoto, Masakazu
    Higuchi, Ryota
    Hatori, Takashi
    Yang, Ching-Yao
    Yamaue, Hiroki
    Hirono, Seiko
    Satoi, Sohei
    Fujii, Tsutomu
    Hirano, Satoshi
    Lou, Wenhui
    Hashimoto, Yasushi
    Shimizu, Yasuhiro
    Del Chiaro, Marco
    Valente, Roberto
    Lohr, Matthias
    Choi, Dong Wook
    Choi, Seong Ho
    Heo, Jin Seok
    Motoi, Fuyuhiko
    Matsumoto, Ippei
    Lee, Woo Jung
    Kang, Chang Moo
    Shyr, Yi-Ming
    Wang, Shin-E
    Han, Ho-Seong
    Yoon, Yoo-Seok
    Besselink, Marc G.
    van Huijgevoort, Nadine C. M.
    Sho, Masayuki
    Nagano, Hiroaki
    Kim, Sang Geol
    Honda, Goro
    Yang, Yinmo
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [47] Fraud Prediction in Smart Societies Using Logistic Regression and k-fold Machine Learning Techniques
    Mishra, Kamta Nath
    Pandey, Subhash Chandra
    WIRELESS PERSONAL COMMUNICATIONS, 2021, 119 (02) : 1341 - 1367
  • [48] Letter to the editor. Comment on: Machine learning versus logistic regression for the prediction of complications after pancreatoduodenectomy
    Ferrara, Eduardo Alcobilla
    SURGERY, 2024, 175 (05) : 1462 - 1462
  • [50] Machine learning-based multimodal prediction of prognosis in patients with resected intrahepatic cholangiocarcinoma
    Schmauch, Benoit
    Brion, Eliott
    Ducret, Valerie
    Nasar, Naaz
    McIntyre, Sarah
    Sin-Chan, Patrick
    Maussion, Charles
    Jarnagin, William R.
    Chakraborty, Jayasree
    JOURNAL OF CLINICAL ONCOLOGY, 2023, 41 (16)