Flexible propensity score estimation strategies for clustered data in observational studies

Cited by: 7
Authors
Chang, Ting-Hsuan [1]
Nguyen, Trang Quynh [2]
Lee, Youjin [3]
Jackson, John W. [1, 2, 4]
Stuart, Elizabeth A. [2, 4, 5]
Affiliations
[1] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Epidemiol, Baltimore, MD 21205 USA
[2] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Mental Hlth, 624 N Broadway,Room 804, Baltimore, MD 21205 USA
[3] Brown Univ, Dept Biostat, Providence, RI 02912 USA
[4] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD 21205 USA
[5] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Hlth Policy & Management, Baltimore, MD 21205 USA
Keywords
clustering; machine learning; observational studies; propensity score weighting; unmeasured confounder; BOOSTED REGRESSION; GUIDE; BIAS
DOI
10.1002/sim.9551
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Existing studies have suggested superior performance of nonparametric machine learning over logistic regression for propensity score estimation. However, it is unclear whether the advantages of nonparametric propensity score modeling are carried to settings where there is clustering of individuals, especially when there is unmeasured cluster-level confounding. In this work we examined the performance of logistic regression (all main effects), Bayesian additive regression trees and generalized boosted modeling for propensity score weighting in clustered settings, with the clustering being accounted for by including either cluster indicators or random intercepts. We simulated data for three hypothetical observational studies of varying sample and cluster sizes. Confounders were generated at both levels, including a cluster-level confounder that is unobserved in the analyses. A binary treatment and a continuous outcome were generated based on seven scenarios with varying relationships between the treatment and confounders (linear and additive, nonlinear/nonadditive, nonadditive with the unobserved cluster-level confounder). Results suggest that when the sample and cluster sizes are large, nonparametric propensity score estimation may provide better covariate balance, bias reduction, and 95% confidence interval coverage, regardless of the degree of nonlinearity or nonadditivity in the true propensity score model. When the sample or cluster sizes are small, however, nonparametric approaches may become more vulnerable to unmeasured cluster-level confounding and thus may not be a better alternative to multilevel logistic regression. We applied the methods to the National Longitudinal Study of Adolescent to Adult Health data, estimating the effect of team sports participation during adolescence on adulthood depressive symptoms.
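As a rough illustration of the workflow the abstract describes, the following minimal sketch simulates clustered data with an unobserved cluster-level confounder, fits a main-effects logistic propensity score model with one intercept per cluster (the "cluster indicators" strategy), and forms inverse probability weights. It is not the authors' code: the data-generating values, the function names (`simulate`, `fit_ps_cluster_indicators`, `ipw_weights`, `weighted_ate`), the gradient-ascent fitting routine, and the small ridge penalty on the cluster intercepts are all assumptions made for illustration.

```python
import math
import random


def sigmoid(t):
    # Clamp the linear predictor so exp() cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def simulate(n_clusters=20, cluster_size=30, seed=0):
    """Hypothetical data: rows of (cluster, x, z, y); u is never returned."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        u = rng.gauss(0, 1)  # unobserved cluster-level confounder
        for _ in range(cluster_size):
            x = rng.gauss(0, 1)  # observed individual-level confounder
            z = 1 if rng.random() < sigmoid(0.5 * x + 0.8 * u) else 0
            y = 1.0 * z + x + u + rng.gauss(0, 1)  # assumed true effect: 1.0
            rows.append((j, x, z, y))
    return rows


def fit_ps_cluster_indicators(rows, n_clusters, steps=500, lr=1.0, ridge=1.0):
    """Logistic model logit(e) = a[cluster] + b * x, fit by gradient ascent.

    The ridge penalty on a[j] is an assumption of this sketch; it keeps an
    intercept finite when a cluster is entirely treated or entirely control.
    """
    b = 0.0
    a = [0.0] * n_clusters
    counts = [0] * n_clusters
    for j, _, _, _ in rows:
        counts[j] += 1
    n = len(rows)
    for _ in range(steps):
        grad_b = 0.0
        grad_a = [0.0] * n_clusters
        for j, x, z, _ in rows:
            resid = z - sigmoid(a[j] + b * x)
            grad_b += resid * x
            grad_a[j] += resid
        b += lr * grad_b / n
        for j in range(n_clusters):
            a[j] += lr * (grad_a[j] - ridge * a[j]) / counts[j]
    return [sigmoid(a[j] + b * x) for j, x, _, _ in rows]


def ipw_weights(rows, ps):
    # Weights targeting the average treatment effect: 1/e or 1/(1-e).
    return [1.0 / e if z == 1 else 1.0 / (1.0 - e)
            for (_, _, z, _), e in zip(rows, ps)]


def weighted_ate(rows, weights):
    # Difference of weighted outcome means (normalized weights per arm).
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for (_, _, z, y), w in zip(rows, weights):
        num[z] += w * y
        den[z] += w
    return num[1] / den[1] - num[0] / den[0]


rows = simulate()
ps = fit_ps_cluster_indicators(rows, n_clusters=20)
weights = ipw_weights(rows, ps)
ate_naive = weighted_ate(rows, [1.0] * len(rows))  # unweighted comparison
ate_ipw = weighted_ate(rows, weights)
print(f"naive: {ate_naive:.3f}  IPW: {ate_ipw:.3f}")
```

In this sketch the per-cluster intercepts can absorb the unobserved cluster-level confounder only because it enters the assumed treatment model additively; the abstract's nonadditive scenarios, the random-intercept alternative, and the tree-based estimators are not represented here.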
Pages: 5016-5032
Number of pages: 17