Variational Bayes latent class analysis for EHR-based phenotyping with large real-world data

Cited by: 0
Authors
Buckley, Brian [1]
O'Hagan, Adrian [1, 2]
Galligan, Marie [3]
Affiliations
[1] Univ Coll Dublin, Sch Math & Stat, Dublin, Ireland
[2] Univ Coll Dublin, Insight Ctr Data Analyt, Dublin, Ireland
[3] Univ Coll Dublin, Sch Med, Dublin, Ireland
Keywords
variational Bayes; latent class analysis; patient phenotyping; real-world evidence; electronic health records;
DOI
10.3389/fams.2024.1302825
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Introduction: Bayesian approaches to patient phenotyping in clinical observational studies have been limited by the computational challenges associated with applying the Markov Chain Monte Carlo (MCMC) approach to real-world data. Approximate Bayesian inference via optimization of the variational evidence lower bound, variational Bayes (VB), has been successfully demonstrated for other applications.
Methods: We investigate the performance and characteristics of currently available VB and MCMC software to explore the practicability of available approaches and provide guidance for clinical practitioners. Two case studies are used to fully explore the methods covering a variety of real-world data. First, we use the publicly available Pima Indian diabetes data to comprehensively compare VB implementations of logistic regression. Second, a large real-world data set, Optum (TM) EHR with approximately one million diabetes patients extended the analysis to large, highly unbalanced data containing discrete and continuous variables. A Bayesian patient phenotyping composite model incorporating latent class analysis (LCA) and regression was implemented with the second case study.
Results: We find that several data characteristics common in clinical data, such as sparsity, significantly affect the posterior accuracy of automatic VB methods compared with conditionally conjugate mean-field methods. We find that for both models, automatic VB approaches require more effort and technical knowledge to set up for accurate posterior estimation and are very sensitive to stopping time compared with closed-form VB methods.
Discussion: Our results indicate that the patient phenotyping composite Bayes model is more easily usable for real-world studies if Monte Carlo is replaced with VB. It can potentially become a uniquely useful tool for decision support, especially for rare diseases where gold-standard biomarker data are sparse but prior knowledge can be used to assist model diagnosis and may suggest when biomarker tests are warranted.
Pages: 14
Related papers
50 in total
  • [21] Large-scale real-world data analysis identifies comorbidity patterns in schizophrenia
    Chenyue Lu
    Di Jin
    Nathan Palmer
    Kathe Fox
    Isaac S. Kohane
    Jordan W. Smoller
    Kun-Hsing Yu
    Translational Psychiatry, 12
  • [22] Large-scale real-world data analysis identifies comorbidity patterns in schizophrenia
    Lu, Chenyue
    Jin, Di
    Palmer, Nathan
    Fox, Kathe
    Kohane, Isaac S.
    Smoller, Jordan W.
    Yu, Kun-Hsing
    TRANSLATIONAL PSYCHIATRY, 2022, 12 (01)
  • [23] Latent Class Choice Model of Heterogeneous Drivers' Route Choice Behavior Based on Learning in a Real-World Experiment
    Tawfik, Aly M.
    Rakha, Hesham A.
    TRANSPORTATION RESEARCH RECORD, 2013, (2334) : 84 - 94
  • [24] The unforeseen business and medical consequences of EHR data collection for a real-world data multiple myeloma registry.
    Keogh, Kevin M.
    Belli, Andrew J.
    Matta, Monica M.
    Tanenbaum, Kathryn A.
    Farrish, Kaeleigh
    Mulcahy, Michael
    Mathura, Shivam
    Williams, Christopher
    Mehr, Shaadi
    Auclair, Daniel
    Norden, Andrew David
    Labkoff, Steven E.
    JOURNAL OF CLINICAL ONCOLOGY, 2019, 37 (15)
  • [25] HEART FAILURE PHENOTYPING BY LATENT CLASS ANALYSIS IDENTIFIES SUBPOPULATIONS AT HIGH RISK OF MORTALITY AND READMISSIONS: INSIGHTS FROM A REAL WORLD DATABASE
    Russo, Cesare
    Shao, Xiao
    Guo, Zhenchao
    Jin, Chelsea
    Burns, Leah
    Goshorn, Alice
    Doddamani, Sanjay
    Christianson, Anastasia
    DeSouza, Mary
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2017, 69 (11) : 778 - 778
  • [26] Use of biologics in patients with psoriasis - A retrospective analysis based on real-world data
    Pan, Jing
    Chang, Xiaodan
    Wang, Lingyan
    Miao, Gang
    Jin, Qiuzi
    Guo, Ningning
    Zhang, Jiayu
    Lv, Yanwei
    Wang, Lifang
    SKIN RESEARCH AND TECHNOLOGY, 2024, 30 (01)
  • [27] Distributed Prony analysis for real-world PMU data
    Khazaei, Javad
    Fan, Lingling
    Jiang, Weiqing
    Manjure, Durgesh
    ELECTRIC POWER SYSTEMS RESEARCH, 2016, 133 : 113 - 120
  • [28] ASSOCIATION BETWEEN PREGNANT CIGARETTE SMOKING AND OFFSPRING BIRTHWEIGHT USING ORACLE EHR REAL-WORLD DATA
    Perkowski, K.
    Taylor, R.
    Yang, L.
    VALUE IN HEALTH, 2024, 27 (06) : S20 - S20
  • [29] Evaluation of electronic health record (EHR) data of patients with acute myeloid leukaemia (AML) for real-world data analyses
    Fruchtenicht, Charlotta
    Flahavan, Evelyn M.
    Xu, Tao
    El-Galaly, Tarec Christoffer
    Davies, Jessica
    Gower-Page, Craig
    Meyer, Anne-Marie
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2020, 29 : 43 - 43
  • [30] Real-world clusters of severe asthma from the Italian Registry on Severe Asthma (IRSA): a longitudinal Latent Class Analysis
    Bilo, Maria Beatrice
    Martini, Matteo
    Antonicelli, Leonardo
    De Michele, Fausto
    Vaghi, Adriano
    Musarra, Antonino
    Micheletto, Claudio
    EUROPEAN RESPIRATORY JOURNAL, 2024, 64