Development of machine learning model for diagnostic disease prediction based on laboratory tests

Cited by: 71
Authors
Park, Dong Jin [1]
Park, Min Woo [2]
Lee, Homin [3]
Kim, Young-Jin [4]
Kim, Yeongsic [5]
Park, Young Hoon [6]
Affiliations
[1] Ewha Womans Univ Korea, Coll Med, Dept Lab Med, Seoul, South Korea
[2] Catholic Univ Korea, Dept Lab Med, St Vincents Hosp, Seoul, South Korea
[3] Dept Res, Future Lab, Seoul, South Korea
[4] Pusan Natl Univ, Finance Fishery Manufacture Ind Math Ctr Big Data, Pusan, South Korea
[5] Catholic Univ Korea, Coll Med, Dept Lab Med, Seoul, South Korea
[6] Catholic Univ Korea, Coll Med, Dept Internal Med, Div Hematol, Seoul, South Korea
Keywords
CONVOLUTIONAL NEURAL-NETWORKS; RANDOM FOREST; DEEP; SEQUENCE; GENE; CLASSIFICATION; REGULARIZATION; HEPATITIS
DOI
10.1038/s41598-021-87171-5
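Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09
Abstract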
The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory test results. 86 attributes (laboratory tests) were selected from datasets based on value counts, clinical importance-related features, and missing values. We collected sample datasets on 5145 cases, including 326,686 laboratory test results. We investigated a total of 39 specific diseases based on the International Classification of Diseases, 10th revision (ICD-10) codes. These datasets were used to construct light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost) ML models and a DNN model using TensorFlow. The optimized ensemble model achieved an F1-score of 81% and prediction accuracy of 92% for the five most common diseases. The deep learning and ML models showed differences in predictive power and disease classification patterns. We used a confusion matrix and analyzed feature importance using the SHAP value method. Our new ML model achieved high efficiency of disease prediction through classification of diseases. This study will be useful in the prediction and diagnosis of diseases.
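The blending step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three models were combined or how the F1-score was averaged, so the weighted soft-voting rule, the equal weights, the macro averaging, and all names and numbers below (`blend_probabilities`, `macro_f1`, the toy probabilities) are assumptions made for illustration only, not the authors' method.

```python
from typing import List, Sequence

Probs = Sequence[Sequence[float]]  # one row of class probabilities per sample


def blend_probabilities(model_probs: Sequence[Probs],
                        weights: Sequence[float]) -> List[List[float]]:
    """Weighted average of per-class probabilities across models (soft voting)."""
    total = sum(weights)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    blended = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        blended.append(row)
    return blended


def predict(blended: Probs) -> List[int]:
    """Pick the class index with the highest blended probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in blended]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class F1 scores (an assumed averaging scheme)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Toy class probabilities standing in for the outputs of the three models
# (hypothetical numbers; 4 samples, 3 disease classes).
dnn_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.5, 0.4, 0.1]]
lgbm_probs = [[0.7, 0.2, 0.1], [0.4, 0.4, 0.2], [0.2, 0.2, 0.6], [0.3, 0.6, 0.1]]
xgb_probs = [[0.5, 0.4, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4], [0.2, 0.7, 0.1]]

y_true = [0, 1, 2, 0]
blended = blend_probabilities([dnn_probs, lgbm_probs, xgb_probs], [1.0, 1.0, 1.0])
y_pred = predict(blended)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = macro_f1(y_true, y_pred, n_classes=3)
```

In practice the three probability tables would come from trained DNN, LightGBM, and XGBoost classifiers, and the weights would be tuned on held-out data rather than fixed at equal values.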
Pages: 11
Related papers
50 in total
  • [1] Development of machine learning model for diagnostic disease prediction based on laboratory tests
    Park, D. J.
    Park, M.
    Kim, Y.
    Park, Y. H.
    [J]. CLINICA CHIMICA ACTA, 2022, 530 : S30 - S30
  • [2] Development of machine learning model for diagnostic disease prediction based on laboratory tests
    Dong Jin Park
    Min Woo Park
    Homin Lee
    Young-Jin Kim
    Yeongsic Kim
    Young Hoon Park
    [J]. Scientific Reports, 11
  • [3] COMPARISON AND DEVELOPMENT OF AN ENSEMBLE MACHINE LEARNING-BASED TOOL IN PREDICTION OF THE RISK OF CKD WITH MINIMAL LABORATORY TESTS
    Xiao, Jing
    Ding, Ruifeng
    Xu, Xiulin
    Su, Haoxuan
    Ye, Zhibin
    Sun, Tao
    Xing, Kaichen
    Ge, Jiacheng
    Zhou, Xinli
    Zhu, Sibo
    [J]. NEPHROLOGY, 2020, 25 : 20 - 20
  • [4] Machine learning-based diagnostic prediction of IgA nephropathy: model development and validation study
    Noda, Ryunosuke
    Ichikawa, Daisuke
    Shibagaki, Yugo
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [5] DIABETES PREDICTION MODEL AND DIAGNOSTIC SYSTEM BASED ON MACHINE LEARNING ALGORITHMS
    Yu, H. P.
    Li, F. Y.
    Xie, Y. Q.
    Guo, M.
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2017, 121 : 53 - 53
  • [6] Development of Patent Technology Prediction Model Based on Machine Learning
    Lee, Chih-Wei
    Tao, Feng
    Ma, Yu-Yu
    Lin, Hung-Lung
    [J]. AXIOMS, 2022, 11 (06)
  • [7] Study on Machine Learning based Heart Disease Prediction Model
    Zhang, Shihan
    [J]. PROCEEDINGS OF 2023 4TH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE FOR MEDICINE SCIENCE, ISAIMS 2023, 2023: 346 - 352
  • [8] Development of a diagnostic model for biliary atresia based on MMP7 and serological tests using machine learning
    Zhao, Yong
    Wang, An
    Wang, Dingding
    Sun, Dayan
    Zhao, Jiawei
    Zhang, Yanan
    Hua, Kaiyun
    Gu, Yichao
    Li, Shuangshuang
    Liao, Junmin
    Wang, Peize
    Sun, Jie
    Huang, Jinshi
    [J]. PEDIATRIC SURGERY INTERNATIONAL, 2024, 40 (01)
  • [9] Development of a Forest Fire Diagnostic Model Based on Machine Learning Techniques
    Roh, Minwoo
    Lee, Sujong
    Jo, Hyun-Woo
    Lee, Woo-Kyun
    [J]. FORESTS, 2024, 15 (07)
  • [10] Machine learning model for diagnostic method prediction in parasitic disease using clinical information
    Lee, You Won
    Choi, Jae Woo
    Shin, Eun-Hee
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 185