Integration of the Extreme Gradient Boosting model with electronic health records to enable the early diagnosis of multiple sclerosis

Cited by: 4
Authors
Wang, Ruoning [1]
Luo, Wenjing [2]
Liu, Zifeng [3]
Liu, Weilong [4]
Liu, Chunxin [2]
Liu, Xun [3]
Zhu, He [5]
Li, Rui [2]
Song, Jiafang [5]
Hu, Xueqiang [2]
Han, Sheng [5]
Qiu, Wei [2]
Affiliations
[1] Peking Univ, Dept Continuing Med Educ, Hlth Sci Ctr, Beijing, Peoples R China
[2] Sun Yat Sen Univ, Dept Neurol, Affiliated Hosp 3, Guangzhou, Peoples R China
[3] Sun Yat Sen Univ, Dept Clin Data Ctr, Affiliated Hosp 3, Guangzhou, Peoples R China
[4] Chengdu Medlinker Sci & Technol Co Ltd, Med Data Operat Dept, Beijing, Peoples R China
[5] Peking Univ, Int Res Ctr Med Adm, Dept Real World Evidence & Pharmacoecon, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bayesian optimization; early diagnostics; machine learning algorithms; MS; XGBoost; ARTIFICIAL-INTELLIGENCE; COMORBIDITY; RISK; RECOGNITION; MANAGEMENT; INCREASES; DELAYS
DOI
10.1016/j.msard.2020.102632
Chinese Library Classification number
R74 [Neurology and Psychiatry]
Abstract
Background: Delayed diagnosis of multiple sclerosis (MS) is not uncommon, and an early diagnostic tool is urgently needed. We aimed to use electronic health records and machine learning techniques to develop an effective tool for recognizing MS patients early among hospital visitors in China.
Methods: Two case sets were collected from January 2016 to December 2018. The training set comprised 239 MS patients and 1142 controls, and the test set comprised 23 MS patients and 92 controls. The utility of Extreme Gradient Boosting (XGBoost), Random Forest (RF), Naive Bayes, K-nearest-neighbor (KNN), and Support Vector Machine (SVM) in the early diagnosis of MS was evaluated by the area under the receiver operating characteristic curve, precision, recall, specificity, accuracy, and F1 score.
Results: XGBoost performed best and was used to generate the results. Thirty-four variables that were highly relevant to MS diagnosis were selected for the XGBoost model, and their relative importance for MS diagnosis was ranked. The training set recall was 0.632, with a precision of 0.576, and the test set recall was 0.609, with a precision of 0.609. Our study found that 61%, 51%, and 49% of the patients could have been diagnosed with MS 1, 2, and 3 years, respectively, earlier than their actual time of diagnosis.
Conclusions: A diagnostic tool for early MS recognition, based on the XGBoost model and electronic health records, was developed to help reduce diagnostic delays in MS.
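As an illustration of how the evaluation metrics named in the abstract relate to one another, the following is a minimal sketch that computes them from confusion-matrix counts. The counts used here are not reported in the abstract; they are an assumption, back-calculated as the integer values consistent with the reported test set size (23 MS, 92 controls) and the rounded test set recall and precision of 0.609. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)             # share of predicted MS cases that are true MS
    recall = tp / (tp + fn)                # share of true MS cases that are detected
    specificity = tn / (tn + fp)           # share of controls correctly ruled out
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# ASSUMED counts (not stated in the abstract): inferred from 23 MS patients,
# 92 controls, and the rounded recall and precision of 0.609.
test_metrics = binary_metrics(tp=14, fp=9, fn=9, tn=83)

for name, value in test_metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, recall and precision both round to 0.609, matching the abstract; the specificity (about 0.902) and accuracy (about 0.843) that the sketch prints follow only from the assumption and are not figures taken from the paper.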
Pages: 6