PATIENT RECRUITMENT USING ELECTRONIC HEALTH RECORDS UNDER SELECTION BIAS: A TWO-PHASE SAMPLING FRAMEWORK

Authors
Zhang, Guanghao [1]
Beesley, Lauren J. [2]
Mukherjee, Bhramar [1]
Shi, Xu [1]
Affiliations
[1] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Los Alamos Natl Lab, Stat Sci Grp, Los Alamos, NM 87545 USA
Source
ANNALS OF APPLIED STATISTICS | 2024, Vol. 18, No. 3
Funding
National Science Foundation (USA);
Keywords
Auxiliary information; electronic health records; selection bias; study design; two-phase sampling; CAUSAL INFERENCE; DESIGN;
DOI
10.1214/23-AOAS1860
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Electronic health records (EHRs) are increasingly recognized as a cost-effective resource for patient recruitment in clinical research. However, how to optimally select a cohort from millions of individuals to answer a scientific question of interest remains unclear. Consider a study to estimate the mean or mean difference of an expensive outcome. Inexpensive auxiliary covariates predictive of the outcome may often be available in patients' health records, presenting an opportunity to recruit patients selectively, which may improve efficiency in downstream analyses. In this paper, we propose a two-phase sampling design that leverages available information on auxiliary covariates in EHR data. A key challenge in using EHR data for multiphase sampling is the potential selection bias, because EHR data are not necessarily representative of the target population. Extending existing literature on two-phase sampling design, we derive an optimal two-phase sampling method that improves efficiency over random sampling while accounting for the potential selection bias in EHR data. We demonstrate the efficiency gain from our sampling design via simulation studies and an application evaluating the prevalence of hypertension among U.S. adults, leveraging data from the Michigan Genomics Initiative, a longitudinal biorepository in Michigan Medicine.
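To make the efficiency argument in the abstract concrete, the following is a minimal sketch of classical two-phase stratified sampling with Neyman allocation. It is not the optimal design derived in the paper, and it does not include any correction for selection bias in the EHR data; it only illustrates why sampling selectively on an inexpensive auxiliary covariate can reduce the variance of a mean estimate relative to sampling that ignores stratum variability. The stratum sizes, stratum standard deviations, budget, and all function names are hypothetical values and helpers chosen for illustration.

```python
from math import fsum


def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Split a phase-two budget n across strata in proportion to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = fsum(weights)
    return [n * w / total for w in weights]


def proportional_allocation(stratum_sizes, n):
    """Split n across strata in proportion to stratum size N_h only."""
    total = sum(stratum_sizes)
    return [n * N_h / total for N_h in stratum_sizes]


def stratified_mean_variance(stratum_sizes, stratum_sds, allocation):
    """Design variance of the stratified mean estimator,
    including the finite-population correction (1 - n_h / N_h)."""
    N = sum(stratum_sizes)
    return fsum(
        (N_h / N) ** 2 * (1 - n_h / N_h) * S_h ** 2 / n_h
        for N_h, S_h, n_h in zip(stratum_sizes, stratum_sds, allocation)
    )


# Hypothetical phase-one strata formed from a cheap auxiliary covariate.
# These numbers are assumptions for illustration, not values from the paper.
stratum_sizes = [60_000, 30_000, 10_000]   # N_h: patients per stratum
stratum_sds = [0.20, 0.40, 0.50]           # S_h: assumed outcome SD per stratum
budget = 1_000                             # phase-two sample size

neyman = neyman_allocation(stratum_sizes, stratum_sds, budget)
proportional = proportional_allocation(stratum_sizes, budget)

var_neyman = stratified_mean_variance(stratum_sizes, stratum_sds, neyman)
var_proportional = stratified_mean_variance(stratum_sizes, stratum_sds, proportional)

print("Neyman allocation:      ", [round(x, 1) for x in neyman])
print("Proportional allocation:", [round(x, 1) for x in proportional])
print("Variance ratio (Neyman / proportional):", var_neyman / var_proportional)
```

In this sketch the allocations are left as real numbers; in practice they would be rounded to integers, and the stratum standard deviations would have to be estimated, for example from a pilot sample or from a model fitted to the auxiliary covariates.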
Pages: 1858-1878
Number of pages: 21