A latent variable model approach to estimating systematic bias in the oversampling method

Cited by: 22
Authors
Hauner, Katherina K. [1, 2]
Zinbarg, Richard E. [1, 3]
Revelle, William [1]
Affiliations
[1] Northwestern Univ, Dept Psychol, Evanston, IL USA
[2] Northwestern Univ, Dept Neurol, Chicago, IL 60611 USA
[3] Northwestern Univ, Family Inst, Evanston, IL USA
Keywords
Sampling; statistical bias; latent variable modeling; EXTREME GROUPS APPROACH; LOW COGNITIVE RISK; REGRESSION-ANALYSIS; STATISTICAL POWER; DISORDERS; EVENTS; DICHOTOMIZATION; INDIVIDUALS; DEPRESSION; PREVALENCE
DOI
10.3758/s13428-013-0402-6
Chinese Library Classification
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
The method of oversampling data from a preselected range of a variable's distribution is often applied by researchers who wish to study rare outcomes without substantially increasing sample size. Despite frequent use, however, it is not known whether this method introduces statistical bias due to disproportionate representation of a particular range of data. The present study employed simulated data sets to examine how oversampling introduces systematic bias in effect size estimates (of the relationship between oversampled predictor variables and the outcome variable), as compared with estimates based on a random sample. In general, results indicated that increased oversampling was associated with a decrease in the absolute value of effect size estimates. Critically, however, the actual magnitude of this decrease in effect size estimates was nominal. This finding thus provides the first evidence that the use of the oversampling method does not systematically bias results to a degree that would typically impact results in behavioral research. Examining the effect of sample size on oversampling yielded an additional important finding: For smaller samples, the use of oversampling may be necessary to avoid spuriously inflated effect sizes, which can arise when the number of predictor variables and rare outcomes is comparable.
Pages: 786-797
Number of pages: 12
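As a rough illustration of the design the abstract describes, the following Python sketch compares effect size estimates from a random sample with estimates from samples in which part of the cases are drawn from a preselected range of the predictor. It is not the authors' procedure. The linear data-generating model, the true correlation of 0.3, the cutoff of 1.0, the sample and population sizes, and all function names are assumptions made for illustration. The sketch also uses a plain Pearson correlation rather than the latent variable model of the paper, so its numbers should not be read as a replication of the reported results.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_population(size, true_r, rng):
    """Assumed model: standard-normal x and y with correlation true_r."""
    population = []
    for _ in range(size):
        x = rng.gauss(0.0, 1.0)
        y = true_r * x + (1 - true_r ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        population.append((x, y))
    return population


def draw_sample(population, tail, n, oversample_frac, rng):
    """Draw n cases; a fraction oversample_frac comes from the tail only.

    The two draws are independent, so a tail case can occasionally
    appear twice; this is ignored here for simplicity.
    """
    n_tail = round(n * oversample_frac)
    return rng.sample(tail, n_tail) + rng.sample(population, n - n_tail)


def mean_estimate(population, tail, n, oversample_frac, reps, rng):
    """Average the sample correlation over reps replications."""
    estimates = []
    for _ in range(reps):
        xs, ys = zip(*draw_sample(population, tail, n, oversample_frac, rng))
        estimates.append(pearson_r(xs, ys))
    return statistics.fmean(estimates)


rng = random.Random(0)  # fixed seed so the run is repeatable
population = simulate_population(20000, 0.3, rng)
# Preselected range: cases whose predictor exceeds the assumed cutoff of 1.0.
tail = [case for case in population if case[0] > 1.0]

results = {
    frac: mean_estimate(population, tail, 200, frac, 200, rng)
    for frac in (0.0, 0.25, 0.5)
}
for frac, estimate in results.items():
    print(f"oversampled fraction {frac:.2f}: mean r = {estimate:.3f}")
```

With an oversampled fraction of 0.0 the procedure reduces to simple random sampling, which gives the baseline against which the other fractions can be compared. Varying the sample size `n` in the same loop would be one way to explore the small-sample behavior mentioned in the abstract.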