Development and validation of explainable machine-learning models for carotid atherosclerosis early screening

Cited by: 4
Authors
Yun, Ke [1, 2]
He, Tao [3]
Zhen, Shi [4]
Quan, Meihui [1, 2]
Yang, Xiaotao [1, 2]
Man, Dongliang [1, 2]
Zhang, Shuang [1, 2]
Wang, Wei [5]
Han, Xiaoxu [1, 2, 6, 7]
Affiliations
[1] China Med Univ, Affiliated Hosp 1, Natl Clin Res Ctr Lab Med, Shenyang, Liaoning, Peoples R China
[2] China Med Univ, Affiliated Hosp 1, Dept Lab Med, Shenyang, Liaoning, Peoples R China
[3] Neusoft Corp, Neusoft Res Inst, Shenyang, Liaoning, Peoples R China
[4] Northeastern Univ, Dept Software Engn, Shenyang, Liaoning, Peoples R China
[5] China Med Univ, Affiliated Hosp 1, Dept Phys Examinat Ctr, Shenyang, Liaoning, Peoples R China
[6] Chinese Acad Med Sci, Lab Med Innovat Unit, Shenyang, Liaoning, Peoples R China
[7] China Med Univ, Affiliated Hosp 1, NHC Key Lab AIDS Immunol, Shenyang, Liaoning, Peoples R China
Keywords
Machine learning; Carotid atherosclerosis; Explainable model; Chinese adults; Risk factors; Prevalence; Ultrasound; Burden; Age; Gender
DOI
10.1186/s12967-023-04093-8
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Background: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China.
Methods: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms were used to construct CAS screening models: decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear). The area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model.
Results: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance, with an auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. The negative predictive value was low for individuals with diabetes and for those over 65 years of age. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol.
Conclusions: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could potentially be applied to CAS prevention in scenarios without ethnic and geographic heterogeneity.
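The three evaluation measures named in the abstract (auROC, auPR, and negative predictive value) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and auPR is approximated here by step-wise average precision without special handling of tied scores.

```python
from typing import Sequence


def au_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative record")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive record")
    # Rank records from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def negative_predictive_value(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Share of records predicted negative (score < threshold) that are truly negative."""
    predicted_negative = [y for y, s in zip(y_true, scores) if s < threshold]
    if not predicted_negative:
        raise ValueError("no record falls below the threshold")
    return predicted_negative.count(0) / len(predicted_negative)


# Hypothetical toy data: 1 = CAS on ultrasound, 0 = no CAS; scores are made up.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print("auROC:", au_roc(labels, scores))
print("auPR (average precision):", average_precision(labels, scores))
print("NPV at threshold 0.5:", negative_predictive_value(labels, scores, 0.5))
```

Computing the negative predictive value separately within a subgroup (for example, only the records of individuals over 65) is how a subgroup-level weakness such as the one reported in the Results would show up.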
Pages: 13
Related papers
50 in total
  • [1] Development and validation of explainable machine-learning models for carotid atherosclerosis early screening
    Ke Yun
    Tao He
    Shi Zhen
    Meihui Quan
    Xiaotao Yang
    Dongliang Man
    Shuang Zhang
    Wei Wang
    Xiaoxu Han
    Journal of Translational Medicine, 21
  • [2] Machine learning models for screening carotid atherosclerosis in asymptomatic adults
    Yu, Jian
    Zhou, Yan
    Yang, Qiong
    Liu, Xiaoling
    Huang, Lili
    Yu, Ping
    Chu, Shuyuan
    SCIENTIFIC REPORTS, 2021, 11 (01)
  • [4] Development and validation of an imageless machine-learning algorithm for the initial screening of prostate cancer
    Martelin, Nicolas
    De Witt, Brian
    Chen, Benjamin
    Eschwege, Pascal
    PROSTATE, 2024, 84 (09): 842-849
  • [5] Explainable Machine Learning for Lung Cancer Screening Models
    Kobylinska, Katarzyna
    Orlowski, Tadeusz
    Adamek, Mariusz
    Biecek, Przemyslaw
    APPLIED SCIENCES-BASEL, 2022, 12 (04)
  • [6] Development and validation of echocardiography-based machine-learning models to predict mortality
    Valsaraj, Akshay
    Kalmady, Sunil Vasu
    Sharma, Vaibhav
    Frost, Matthew
    Sun, Weijie
    Sepehrvand, Nariman
    Ong, Marcus
    Equilbec, Cyril
    Dyck, Jason R. B.
    Anderson, Todd
    Becher, Harald
    Weeks, Sarah
    Tromp, Jasper
    Hung, Chung-Lieh
    Ezekowitz, Justin A.
    Kaul, Padma
    EBIOMEDICINE, 2023, 90
  • [7] Development and Validation of a Machine-Learning Model to Predict Early Recurrence of Intrahepatic Cholangiocarcinoma
    Laura Alaimo
    Henrique A. Lima
    Zorays Moazzam
    Yutaka Endo
    Jason Yang
    Andrea Ruzzenente
    Alfredo Guglielmi
    Luca Aldrighetti
    Matthew Weiss
    Todd W. Bauer
    Sorin Alexandrescu
    George A. Poultsides
    Shishir K. Maithel
    Hugo P. Marques
    Guillaume Martel
    Carlo Pulitano
    Feng Shen
    François Cauchy
    Bas Groot Koerkamp
    Itaru Endo
    Minoru Kitago
    Timothy M. Pawlik
    Annals of Surgical Oncology, 2023, 30: 5406-5415
  • [8] Development and validation of a machine-learning prediction model to improve abdominal aortic aneurysm screening
    Salzler, Gregory G.
    Ryer, Evan J.
    Abdu, Robert W.
    Lanyado, Alon
    Sagiv, Tal
    Choman, Eran N.
    Tariq, Abdul A.
    Urick, Jim
    Mitchell, Elliot G.
    Maff, Rebecca M.
    Delong, Grant
    Shriner, Stacey L.
    Elmore, James R.
    Hasharon, Hod
    JOURNAL OF VASCULAR SURGERY, 2024, 79 (04): 776-783
  • [9] Artificial Intelligence Screening of Medical School Applications: Development and Validation of a Machine-Learning Algorithm
    Triola, Marc M.
    Reinstein, Ilan
    Marin, Marina
    Gillespie, Colleen
    Abramson, Steven
    Grossman, Robert I.
    Rivera Jr, Rafael
    ACADEMIC MEDICINE, 2023, 98 (09): 1036-1043
  • [10] Explainable Online Validation of Machine Learning Models for Practical Applications
    Fuhl, Wolfgang
    Rong, Yao
    Motz, Thomas
    Scheidt, Michael
    Hartel, Andreas
    Koch, Andreas
    Kasneci, Enkelejda
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 3304-3311