Using Machine Learning to Identify Health Outcomes from Electronic Health Record Data

Cited by: 46
Authors
Wong, Jenna [1, 2]
Murray Horwitz, Mara [1, 2]
Zhou, Li [3, 4]
Toh, Sengwee [1, 2]
Affiliations
[1] Harvard Med Sch, Dept Populat Med, 401 Pk Dr,Suite 401 East, Boston, MA 02215 USA
[2] Harvard Pilgrim Hlth Care Inst, 401 Pk Dr,Suite 401 East, Boston, MA 02215 USA
[3] Brigham & Womens Hosp, Div Gen Internal Med & Primary Care, 75 Francis St, Boston, MA 02115 USA
[4] Harvard Med Sch, Boston, MA 02115 USA
Funding
US Agency for Healthcare Research and Quality;
Keywords
Electronic health records; Machine learning; Health outcomes; Phenotyping; Cohort identification; MEDICAL-RECORDS; NEURAL-NETWORKS; TASK-FORCE; BIG DATA; CLASSIFICATION; TEXT; GUIDELINE; DIAGNOSIS; CRITERIA; COLLEGE;
DOI
10.1007/s40471-018-0165-9
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004; 120402;
Abstract
Purpose of Review: Electronic health records (EHRs) contain valuable data for identifying health outcomes, but these data also present numerous challenges when creating computable phenotyping algorithms. Machine learning methods could help with some of these challenges. In this review, we discuss four common scenarios that researchers may find helpful for thinking critically about when and for what tasks machine learning may be used to identify health outcomes from EHR data.

Recent Findings: We first consider the conditions in which machine learning may be especially useful with respect to two dimensions of a health outcome: (1) the characteristics of its diagnostic criteria and (2) the format in which its diagnostic data are usually stored within EHR systems. In the first dimension, we propose that for health outcomes with diagnostic criteria involving many clinical factors, vague definitions, or subjective interpretations, machine learning may be useful for modeling the complex diagnostic decision-making process from a vector of clinical inputs to identify individuals with the health outcome. In the second dimension, we propose that for health outcomes where diagnostic information is largely stored in unstructured formats such as free text or images, machine learning may be useful for extracting and structuring this information as part of a natural language processing system or an image recognition task. We then consider these two dimensions jointly to define four common scenarios of health outcomes. For each scenario, we discuss the potential uses for machine learning, first assuming accurate and complete EHR data and then relaxing these assumptions to accommodate the limitations of real-world EHR systems. We illustrate these four scenarios using concrete examples and describe how recent studies have used machine learning to identify these health outcomes from EHR data.

Summary: Machine learning has great potential to improve the accuracy and efficiency of health outcome identification from EHR systems, especially under certain conditions. To promote the use of machine learning in EHR-based phenotyping tasks, future work should prioritize efforts to increase the transportability of machine learning algorithms for use in multi-site settings.
Pages: 331-342
Number of pages: 12