Enhancing patient representation learning with inferred family pedigrees improves disease risk prediction

Times cited: 0
Authors
Huang, Xiayuan [1]
Arora, Jatin [2]
Erzurumluoglu, Abdullah Mesut [2]
Stanhope, Stephen A. [3]
Lam, Daniel [4]
Khoueiry, Pierre
Jensen, Jan N.
Cai, James
Lawless, Nathan
Kriegl, Jan
Ding, Zhihao
de Jong, Johann [6,7]
Zhao, Hongyu [1]
Wang, Zuoheng [1,2,5]
Affiliations
[1] Yale Univ, Sch Publ Hlth, Dept Biostat, New Haven, CT 06510 USA
[2] Boehringer Ingelheim Pharm GmbH & Co KG, Global Computat Biol & Digital Sci, Human Genet, D-88400 Biberach, Germany
[3] Boehringer Ingelheim GmbH & Co KG, Real World Data & Analyt, Global Med Affairs, Ridgefield, CT 06877 USA
[4] Boehringer Ingelheim Pharm GmbH & Co KG, CB CMDR, Global Computat Biol & Digital Sci, D-88400 Biberach, Germany
[5] Yale Univ, Sch Med, Dept Biomed Informat & Data Sci, New Haven, CT 06510 USA
[6] Boehringer Ingelheim Pharm GmbH & Co KG, Global Computat Biol & Digital Sci, Stat Modeling, D-88400 Biberach, Germany
[7] UCB Biosci GmbH, Adv Analyt Patient Solut, D-40789 Monheim, Germany
Keywords
electronic health records; patient modeling; disease risk prediction; graph attention networks; ULCERATIVE-COLITIS; HERITABILITY; HISTORY; RECORD
DOI
10.1093/jamia/ocae297
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.

Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.

Results: Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, successfully captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to successfully distinguish the 2 main inflammatory bowel disease subtypes with highly shared risk factors and symptoms (Crohn's disease and ulcerative colitis).

Conclusion: Overall, our results highlight that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's great potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.
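
The idea in the Methods of attending over relatives can be illustrated with a minimal sketch. This is not ALIGATEHR's implementation: it assumes a generic single-head graph attention step in NumPy, omits the medical ontology component entirely, and uses hypothetical names throughout (`family_attention`, `relatives`, `W`, `a`) with random toy data.

```python
import numpy as np

def family_attention(features, relatives, W, a, slope=0.2):
    """One single-head graph-attention step over a family graph.

    Illustrative only; all names and shapes are assumptions.
    features : (n_patients, d_in) array of patient feature vectors
    relatives: dict mapping patient index -> list of relative indices
    W        : (d_in, d_out) shared linear projection
    a        : (2 * d_out,) attention vector
    Returns the updated embeddings and the attention weights per patient.
    """
    z = features @ W  # project every patient into the embedding space
    out = np.zeros_like(z)
    weights = {}
    for i in range(z.shape[0]):
        # Each patient attends to itself plus its inferred relatives.
        nbrs = [i] + [j for j in relatives.get(i, []) if j != i]
        scores = np.array([a @ np.concatenate([z[i], z[j]]) for j in nbrs])
        scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU
        scores = scores - scores.max()  # shift for numerical stability
        alpha = np.exp(scores) / np.exp(scores).sum()  # softmax over neighbors
        out[i] = alpha @ z[nbrs]  # attention-weighted sum of neighbor embeddings
        weights[i] = dict(zip(nbrs, alpha))
    return out, weights

# Toy pedigree (hypothetical): patient 0 has relatives 1 and 2; patient 3 has none.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5))
relatives = {0: [1, 2], 1: [0], 2: [0], 3: []}
W = rng.normal(size=(5, 3))
a = rng.normal(size=6)
embeddings, attn = family_attention(features, relatives, W, a)
```

In this sketch, `attn[i]` holds the weight patient `i` places on each relative, which is the kind of quantity one could inspect for interpretability; a patient with no inferred relatives simply keeps their own projected features.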
Pages: 435-446
Page count: 12