Enhancing patient representation learning with inferred family pedigrees improves disease risk prediction

Cited by: 0
Authors
Huang, Xiayuan [1]
Arora, Jatin [2]
Erzurumluoglu, Abdullah Mesut [2]
Stanhope, Stephen A. [3]
Lam, Daniel [4]
Khoueiry, Pierre
Jensen, Jan N.
Cai, James
Lawless, Nathan
Kriegl, Jan
Zhao, Hongyu [1]
Ding, Zhihao
Wang, Zuoheng [1,2,5]
de Jong, Johann [6,7]
Affiliations
[1] Yale Univ, Sch Publ Hlth, Dept Biostat, New Haven, CT 06510 USA
[2] Boehringer Ingelheim Pharm GmbH & Co KG, Global Computat Biol & Digital Sci, Human Genet, D-88400 Biberach, Germany
[3] Boehringer Ingelheim GmbH & Co KG, Real World Data & Analyt, Global Med Affairs, Ridgefield, CT 06877 USA
[4] Boehringer Ingelheim Pharm GmbH & Co KG, CB CMDR, Global Computat Biol & Digital Sci, D-88400 Biberach, Germany
[5] Yale Univ, Sch Med, Dept Biomed Informat & Data Sci, New Haven, CT 06510 USA
[6] Boehringer Ingelheim Pharm GmbH & Co KG, Global Computat Biol & Digital Sci, Stat Modeling, D-88400 Biberach, Germany
[7] UCB Biosci GmbH, Adv Analyt Patient Solut, D-40789 Monheim, Germany
Keywords
electronic health records; patient modeling; disease risk prediction; graph attention networks; ULCERATIVE-COLITIS; HERITABILITY; HISTORY; RECORD
DOI
10.1093/jamia/ocae297
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.
Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.
Results: Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, successfully captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to successfully distinguish the 2 main inflammatory bowel disease subtypes with highly shared risk factors and symptoms (Crohn's disease and ulcerative colitis).
Conclusion: Overall, our results highlight that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's great potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.
Pages: 435-446
Number of pages: 12
Related papers
50 in total
  • [22] Dietary information improves cardiovascular disease risk prediction models
    Baik, I.
    Cho, N. H.
    Kim, S. H.
    Shin, C.
    EUROPEAN JOURNAL OF CLINICAL NUTRITION, 2013, 67 (01) : 25 - 30
  • [23] Strategic Machine Learning Optimization for Cardiovascular Disease Prediction and High-Risk Patient Identification
    Tompra, Konstantina-Vasiliki
    Papageorgiou, George
    Tjortjis, Christos
    ALGORITHMS, 2024, 17 (05)
  • [24] Representation learning in intraoperative vital signs for heart failure risk prediction
    Chen, Yuwen
    Qi, Baolian
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2019, 19 (01)
  • [26] Transfer learning with false negative control improves polygenic risk prediction
    Jeng, Xinge Jessie
    Hu, Yifei
    Venkat, Vaishnavi
    Lu, Tzu-Pin
    Tzeng, Jung-Ying
    PLOS GENETICS, 2023, 19 (11)
  • [27] Machine Learning Algorithm Improves Accuracy of Perioperative Risk Prediction Tools
    Terhune, J. H.
    Edge, S. B.
    Nurkin, S.
    ANNALS OF SURGICAL ONCOLOGY, 2019, 26 : S15 - S16
  • [29] Deep learning model improves COPD risk prediction and gene discovery
    Cosentino, Justin
    Hormozdiari, Farhad
    NATURE GENETICS, 2023, 55 (05) : 738 - 739
  • [30] Enhancing Healthcare: Machine Learning for Diabetes Prediction and Retinopathy Risk Evaluation
    Barakat, Ghinwa
    Hassan, Samer El Hajj
    Duong-Trung, Nghia
    Ramadan, Wiam
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 18 - 36