High-throughput phenotyping with temporal sequences

Cited by: 11
Authors
Estiri, Hossein [1, 2, 3]
Strasser, Zachary H. [1, 2, 3]
Murphy, Shawn N. [1, 2, 3]
Affiliations
[1] Harvard Med Sch, Boston, MA 02115 USA
[2] Massachusetts Gen Hosp, Boston, MA 02114 USA
[3] Mass Gen Brigham, Boston, MA USA
Keywords
phenotyping; electronic health records; sequential pattern mining; temporal data representation; ELECTRONIC MEDICAL-RECORDS; EMERGE NETWORK; HEALTH RECORDS; HEART-FAILURE; RISK-FACTORS; DEMENTIA; REPRESENTATION; INFORMATION; ABSTRACTION; ALGORITHMS;
DOI
10.1093/jamia/ocaa288
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objective: High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. This study aims to develop a high-throughput phenotyping method, leveraging temporal sequential patterns from EHRs.
Materials and Methods: We develop a representation mining algorithm to extract 5 classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (aggregated vector representation), the standard sequential patterns (sequential pattern mining), the transitive sequential patterns (transitive sequential pattern mining), and 2 hybrid classes. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we train and validate phenotyping algorithms.
Results: Phenotyping with temporal sequences resulted in a superior classification performance across all 10 phenotypes compared with the standard representations in electronic phenotyping. The high-throughput algorithm's classification performance was superior or similar to the performance of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations.
Discussion: The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to assume from raw EHR data. Transitive sequences offer more accurate characterization of the phenotype, compared with its individual components, and reflect the actual lived experiences of the patients with that particular disease.
Conclusion: Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning. Our approach starts with user interpretability and works backward to the technology.
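The contrast between an aggregated vector and transitive sequences can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the record layout, the code labels, and the function names are all hypothetical. It assumes that a transitive sequence is an ordered pair of records in which the first strictly precedes the second in time, whether or not other records fall between them, and that pairs of identical codes are skipped.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical patient history: (record date, record code).
records = [
    (date(2015, 1, 10), "dx:hypertension"),
    (date(2016, 3, 2), "rx:metformin"),
    (date(2016, 3, 2), "dx:type2_diabetes"),
    (date(2018, 7, 21), "dx:heart_failure"),
]


def aggregated_vector(recs):
    """Count how often each code occurs, ignoring time entirely."""
    return Counter(code for _, code in recs)


def transitive_pairs(recs):
    """Return every (earlier_code, later_code) pair, adjacent or not.

    Assumption: same-day records have no defined order, and pairs of
    identical codes are dropped. Both choices are simplifications.
    """
    ordered = sorted(recs, key=lambda r: r[0])
    pairs = set()
    for (d1, c1), (d2, c2) in combinations(ordered, 2):
        if d1 < d2 and c1 != c2:
            pairs.add((c1, c2))
    return pairs


counts = aggregated_vector(records)
pairs = transitive_pairs(records)
```

In this toy history the pair `("dx:hypertension", "dx:heart_failure")` is kept even though two other records sit between the two diagnoses, which is the kind of non-adjacent combination the aggregated counts cannot express.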
Pages: 772-781
Page count: 10