Robust Machine Learning for Colorectal Cancer Risk Prediction and Stratification

Cited by: 33
Authors
Nartowt, Bradley J. [1]
Hart, Gregory R. [1]
Muhammad, Wazir [1]
Liang, Ying [2]
Stark, Gigi F. [3]
Deng, Jun [1]
Affiliations
[1] Yale Univ, Dept Therapeut Radiol, New Haven, CT 06510 USA
[2] Medical Coll Wisconsin, Dept Radiat Oncol, Milwaukee, WI USA
[3] Yale Univ, Dept Stat & Data Sci, New Haven, CT USA
Source
FRONTIERS IN BIG DATA | 2020 / Volume 3
Funding
National Institutes of Health (USA);
Keywords
colorectal cancer; risk stratification; neural network; concordance; self-reportable health data; external validation;
DOI
10.3389/fdata.2020.00006
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
While colorectal cancer (CRC) is third in prevalence and mortality among cancers in the United States, there is no effective method to screen the general public for CRC risk. In this study, to identify an effective mass screening method for CRC risk, we evaluated seven supervised machine learning algorithms: linear discriminant analysis, support vector machine, naive Bayes, decision tree, random forest, logistic regression, and artificial neural network. Models were trained and cross-tested with the National Health Interview Survey (NHIS) and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial datasets. Six imputation methods were used to handle missing data: mean, Gaussian, Lorentzian, one-hot encoding, Gaussian expectation-maximization, and listwise deletion. Among all combinations of model configuration and imputation method, the artificial neural network with expectation-maximization imputation emerged as the best, with a concordance of 0.70 ± 0.02, a sensitivity of 0.63 ± 0.06, and a specificity of 0.82 ± 0.04. In stratifying CRC risk in the NHIS and PLCO datasets, only 2% of negative cases were misclassified as high risk and 6% of positive cases were misclassified as low risk. In modeling the CRC-free probability with Kaplan-Meier estimators, the low-, medium-, and high-risk CRC groups showed statistically significant separation. Our results indicated that the trained artificial neural network can be used as an effective screening tool for early intervention and prevention of CRC in large populations.
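As an illustration of the evaluation quantities reported in the abstract, the following is a minimal sketch (not the authors' code) of how concordance, sensitivity, specificity, and a three-tier risk stratification can be computed from predicted risk scores. The scores, labels, the 0.5 decision threshold, and the 0.3 / 0.65 tier cut-offs are all hypothetical values chosen for illustration; the record above does not state the thresholds used in the paper.

```python
from itertools import product


def concordance(scores, labels):
    """Probability that a random positive case scores higher than a
    random negative case (ties count as half). Equivalent to the AUC."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def stratify(score, low_cut, high_cut):
    """Assign a risk tier using two hypothetical cut-offs."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


# Hypothetical predicted risks and true outcomes (1 = CRC, 0 = no CRC).
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

c_index = concordance(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
tiers = [stratify(s, low_cut=0.3, high_cut=0.65) for s in scores]

print(f"concordance={c_index:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
print(tiers)
```

In this toy data, one positive case (score 0.40) ranks below one negative case (score 0.60), which is what pulls the concordance below 1. The stratification errors quoted in the abstract correspond to negatives landing in the "high" tier and positives landing in the "low" tier.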
Pages: 12
Related papers
50 in total
  • [1] Building Robust Machine Learning Models for Colorectal Cancer Risk Prediction
    Nartowt, B.
    Hart, G.
    Muhammad, W.
    Liang, Y.
    Deng, J.
    [J]. MEDICAL PHYSICS, 2019, 46 (06) : E324 - E324
  • [2] Machine Learning for Colorectal Cancer Risk Prediction
    Zheng, Ling
    Eniola, Elijah
    Wang, Jiacun
    [J]. 2021 INTERNATIONAL CONFERENCE ON CYBER-PHYSICAL SOCIAL INTELLIGENCE (ICCSI), 2021
  • [3] A Model of Risk of Colorectal Cancer Tested between Studies: Building Robust Machine Learning Models for Colorectal Cancer Risk Prediction
    Nartowt, B.
    Hart, G. R.
    Muhammad, W.
    Liang, Y.
    Deng, J.
    [J]. INTERNATIONAL JOURNAL OF RADIATION ONCOLOGY BIOLOGY PHYSICS, 2019, 105 (01) : E132 - E132
  • [4] Radiomic-based machine learning model for the accurate prediction of prostate cancer risk stratification
    Shu, Xin
    Liu, Yunfan
    Qiao, Xiaofeng
    Ai, Guangyong
    Liu, Li
    Liao, Jun
    Deng, Zhengqiao
    He, Xiaojing
    [J]. BRITISH JOURNAL OF RADIOLOGY, 2023, 96 (1143)
  • [5] Machine Learning in Colorectal Cancer Risk Prediction from Routinely Collected Data: A Review
    Burnett, Bruce
    Zhou, Shang-Ming
    Brophy, Sinead
    Davies, Phil
    Ellis, Paul
    Kennedy, Jonathan
    Bandyopadhyay, Amrita
    Parker, Michael
    Lyons, Ronan A.
    [J]. DIAGNOSTICS, 2023, 13 (02)
  • [6] Machine Learning for Prediction and Risk Stratification of Lupus Nephritis Renal Flare
    Chen, Yinghua
    Huang, Siwan
    Chen, Tiange
    Liang, Dandan
    Yang, Jing
    Zeng, Caihong
    Li, Xiang
    Xie, Guotong
    Liu, ZhiHong
    [J]. AMERICAN JOURNAL OF NEPHROLOGY, 2021, 52 (02) : 152 - 160
  • [7] Machine Learning in Prediction of Second Primary Cancer and Recurrence in Colorectal Cancer
    Ting, Wen-Chien
    Lu, Yen-Chiao Angel
    Ho, Wei-Chi
    Cheewakriangkrai, Chalong
    Chang, Horng-Rong
    Lin, Chia-Ling
    [J]. INTERNATIONAL JOURNAL OF MEDICAL SCIENCES, 2020, 17 (03) : 280 - 291
  • [8] Advanced Machine Learning in Prediction of Second Primary Cancer in Colorectal Cancer
    Chang, Chi-Chang
    Chen, Ying-Chen
    [J]. DIGITAL PERSONALIZED HEALTH AND MEDICINE, 2020, 270 : 1191 - 1192
  • [9] Personalized Colorectal Cancer Survivability Prediction with Machine Learning Methods
    Li, Samuel
    Razzaghi, Talayeh
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018 : 2554 - 2558
  • [10] An interpretable machine learning prognostic system for risk stratification in oropharyngeal cancer
    Alabi, Rasheed Omobolaji
    Almangush, Alhadi
    Elmusrati, Mohammed
    Leivo, Ilmo
    Makitie, Antti A.
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2022, 168