Interpretable Phenotyping for Electronic Health Records

Cited by: 1
Authors
Allen, Christine [1]
Hu, Juhua [2]
Kumar, Vikas [1]
Ahmad, Muhammad Aurangzeb [1]
Teredesai, Ankur [2]
Affiliations
[1] KenSci Inc, Seattle, WA 98104 USA
[2] Univ Washington, Ctr Data Sci, Sch Engn & Technol, Tacoma, WA USA
Keywords
EHRs; High-Dimensionality; Data Phenotyping; Unsupervised Learning; Interpretable Phenotyping;
DOI
10.1109/ICHI52183.2021.00034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Datasets from Electronic Health Records (EHRs) are increasingly large and complex, creating challenges in their use for predictive modeling. The two major challenges are large scale and high dimensionality. One common way to address the large-scale challenge is through the use of data phenotypes: clinically relevant characteristic groupings that can be expressed as logical queries (e.g., "senior patients with diabetes"). With the increasing use of machine learning across the continuum of care, phenotypes play an important role in modeling for population management, clinical trials, observational and interventional research, and quality measures. Yet phenotype interpretation can often be difficult and can require post-hoc clarification from experienced clinicians. For example, detailed analysis may be needed to find that all patients in a phenotype are diabetic seniors with complications from previous surgery. Moreover, the high-dimensionality problem is often addressed, either separately from or simultaneously with phenotyping, by dimension reduction methods that may further hinder interpretability. In this paper, we introduce the notion of interpretable data phenotypes generated by an unsupervised learning technique. The methods are designed to disambiguate relative feature memberships, thus facilitating general clinical validation and alleviating the problem of high dimensionality. The empirical study applies the proposed unsupervised interpretable phenotyping method to a real-world healthcare dataset (MIMIC), then uses hospital length of stay as a reference prediction task. The results demonstrate that the proposed method produces phenotypes with improved interpretability without diminishing the quality of prediction results.
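The abstract's notion of a phenotype "expressed as a logical query" can be illustrated with a minimal Python sketch. This is not the paper's method: the `Patient` record, the `senior_with_diabetes` rule, the age threshold of 65, and the toy data are all hypothetical and serve only to show what such a query might look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """Hypothetical, heavily simplified patient record."""
    patient_id: str
    age: int
    diagnoses: frozenset


def senior_with_diabetes(patient, senior_age=65):
    """Logical query for the example phenotype 'senior patients with diabetes'.

    The threshold of 65 is an assumption, not a value taken from the paper.
    """
    return patient.age >= senior_age and "diabetes" in patient.diagnoses


def select_phenotype(patients, rule):
    """Return the IDs of all patients that satisfy the phenotype rule."""
    return [p.patient_id for p in patients if rule(p)]


# Toy data, invented for illustration only.
patients = [
    Patient("p1", 72, frozenset({"diabetes", "hypertension"})),
    Patient("p2", 58, frozenset({"diabetes"})),
    Patient("p3", 80, frozenset({"asthma"})),
]

members = select_phenotype(patients, senior_with_diabetes)
print(members)
```

Because the rule is a plain boolean predicate over named features, a clinician can read it directly, which is the kind of interpretability the abstract contrasts with post-hoc analysis of opaque groupings.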
Pages: 161-170
Page count: 10