Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods

Cited by: 8
Authors
Thangaraj, Phyllis M. [1 ,2 ]
Kummer, Benjamin R. [3 ]
Lorberbaum, Tal [1 ,2 ]
Elkind, Mitchell S. V. [4 ,5 ]
Tatonetti, Nicholas P. [1 ,2 ]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, 622 W 168th St,PH 20, New York, NY 10032 USA
[2] Columbia Univ, Dept Syst Biol, New York, NY 10027 USA
[3] Icahn Sch Med Mt Sinai, Dept Neurol, New York, NY USA
[4] Columbia Univ, Vagelos Coll Phys & Surg, Dept Neurol, New York, NY USA
[5] Columbia Univ, Mailman Sch Publ Hlth, Dept Epidemiol, New York, NY USA
Keywords
Phenotyping algorithms; Acute ischemic stroke; Machine learning; Electronic health record studies; Big data; Diagnosis; Models; Risk
DOI
10.1186/s13040-020-00230-x
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Background: Accurate identification of acute ischemic stroke (AIS) patient cohorts is essential for a wide range of clinical investigations. Automated phenotyping methods that leverage electronic health records (EHRs) represent a fundamentally new approach to cohort identification, avoiding the current laborious and poorly generalizable process of developing phenotyping algorithms. We systematically compared and evaluated the ability of machine learning algorithms and case-control combinations to phenotype acute ischemic stroke patients using data from an EHR.
Materials and methods: Using structured patient data from the EHR at a tertiary-care hospital system, we built and evaluated machine learning models to identify patients with AIS based on 75 different case-control and classifier combinations. We then estimated the prevalence of AIS patients across the EHR. Finally, we externally validated the ability of the models to detect AIS patients without AIS diagnosis codes using the UK Biobank.
Results: Across all models, with minimal feature processing, the mean AUROC for detecting AIS was 0.963 ± 0.0520 and the average precision score was 0.790 ± 0.196. Classifiers trained on cases with AIS diagnosis codes and controls without cerebrovascular disease codes had the best average F1 score (0.832 ± 0.0383). In the external validation, the patients with the highest predicted probabilities in a model-predicted AIS cohort were significantly enriched for AIS patients without AIS diagnosis codes (60- to 150-fold over expected).
Conclusions: Our findings support machine learning algorithms as a generalizable way to accurately identify AIS patients without process-intensive manual feature curation. When a set of AIS patients is unavailable, diagnosis codes may be used to train classifier models.
Pages: 14