A comparison of methods for simulating correlated binary variables with specified marginal means and correlations

Cited by: 15
Authors
Preisser, John S. [1]
Qaqish, Bahjat F. [1]
Affiliations
[1] Univ N Carolina, Dept Biostat, Gillings Sch Global Publ Hlth, Chapel Hill, NC 27599 USA
Keywords
binomial; correlation; Fréchet bounds; generalized estimating equations; multivariate binary; overdispersion; GENERALIZED ESTIMATING EQUATIONS; LONGITUDINAL DATA; CORRELATION BOUNDS; QUASI-LIKELIHOOD; MODELS; REGRESSION
DOI
10.1080/00949655.2013.818148
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Simulation studies employed to study properties of estimators for parameters in population-average models for clustered or longitudinal data require suitable algorithms for data generation. Methods for generating correlated binary data that allow general specifications of the marginal mean and correlation structures are particularly useful. We compare an algorithm based on dichotomizing multi-normal variates to one based on a conditional linear family (CLF) of distributions [Qaqish BF. A family of multivariate binary distributions for simulating correlated binary variables with specified marginal means and correlations. Biometrika. 2003;90:455-463] with respect to range restrictions induced on correlations. Examples include generating longitudinal binary data and generating correlated binary data compatible with specified marginal means and covariance structures for bivariate, overdispersed binomial outcomes. Results show the CLF method gives a wider range of correlations for longitudinal data having autocorrelated within-subject associations, while the multivariate probit method gives a wider range of correlations for clustered data having exchangeable-type correlations. In the case of a decaying-product correlation structure, it is shown that the CLF method achieves the nonparametric limits on the range of correlations, which cannot be surpassed by any method.
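The range restrictions mentioned in the abstract start from the Fréchet bounds on the joint probability of two binary variables. The following is a minimal sketch of that pairwise bound only, not of the CLF or multivariate probit algorithms compared in the paper; the function name `frechet_correlation_bounds` is hypothetical and chosen for illustration. It assumes two binary variables with marginal means p1 and p2 strictly between 0 and 1.

```python
import math


def frechet_correlation_bounds(p1: float, p2: float) -> tuple[float, float]:
    """Return (lower, upper) limits on corr(Y1, Y2) for binary Y1, Y2.

    Illustrative helper (name is an assumption, not from the paper).
    The joint probability p11 = P(Y1 = 1, Y2 = 1) must satisfy
        max(0, p1 + p2 - 1) <= p11 <= min(p1, p2),
    and the correlation is (p11 - p1 * p2) / sqrt(p1 * q1 * p2 * q2).
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("marginal means must lie strictly between 0 and 1")

    # Product of the two Bernoulli standard deviations.
    sd = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))

    # Frechet limits on the joint success probability.
    p11_min = max(0.0, p1 + p2 - 1.0)
    p11_max = min(p1, p2)

    lower = (p11_min - p1 * p2) / sd
    upper = (p11_max - p1 * p2) / sd
    return lower, upper


if __name__ == "__main__":
    # Unequal means shrink the attainable range below (-1, 1).
    print(frechet_correlation_bounds(0.2, 0.5))
```

No generation method can produce a pairwise correlation outside these limits, so a target correlation matrix should be checked against them before simulating. The check is necessary but not sufficient: a matrix can satisfy every pairwise bound and still be unattainable jointly or under a particular algorithm.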
Pages: 2441-2452
Number of pages: 12