Model-assisted calibration of non-probability sample survey data using adaptive LASSO

Cited by: 0

Authors:

Chen, Jack Kuang Tsung ^{[1]}

Valliant, Richard L. ^{[2]}

Elliott, Michael R. ^{[3,4]}

Affiliations:

[1] Survey Monkey Inc, Palo Alto, CA 94301 USA

[2] Univ Michigan, Inst Social Res, Survey Res Ctr, Ann Arbor, MI USA

[3] Univ Michigan, Sch Publ Hlth, Survey Res Ctr, Inst Social Res, Ann Arbor, MI 48109 USA

[4] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109 USA

Source:

SURVEY METHODOLOGY | 2018 / Vol. 44 / No. 01

Keywords:

Adaptive LASSO estimators; Generalized regression estimator; Non-representative sample; Over-fitting; Variable selection; Oracle property; REGRESSION ESTIMATION; SELECTION;

DOI:

Not available

Chinese Library Classification (CLC) number:

O1 [Mathematics]; C [Social Sciences, General];

Discipline classification code:

03 ; 0303 ; 0701 ; 070101 ;

Abstract:

The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk of selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that model calibration using adaptive LASSO provides improved estimation with respect to mean square error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models will suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.
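
As a rough illustration of the calibration idea mentioned in the abstract (adjusting sample weights so that weighted sample totals reproduce known population totals), the following minimal Python sketch computes linear calibration weights for a single auxiliary variable. It is not the paper's estimator: the function name `linear_calibration_weights`, the starting weights `d`, the auxiliary values `x`, and the toy numbers are all assumptions made for illustration. In a model-calibration setting, `x` would plausibly be the outcome predictions from a fitted model (for example an adaptive LASSO fit), and `pop_total_x` the population total of those predictions.

```python
def linear_calibration_weights(d, x, pop_size, pop_total_x):
    """Return weights w_i = d_i * (1 + l0 + l1 * x_i) that reproduce
    the known population size and the known population total of x.

    d           -- starting weights (hypothetical; e.g. all equal)
    x           -- auxiliary variable observed on each sample unit
    pop_size    -- known population size N
    pop_total_x -- known population total of x
    """
    # Weighted sample moments under the starting weights.
    s0 = sum(d)
    s1 = sum(di * xi for di, xi in zip(d, x))
    s2 = sum(di * xi * xi for di, xi in zip(d, x))

    # Gaps between the known totals and the weighted sample totals.
    g0 = pop_size - s0
    g1 = pop_total_x - s1

    # Solve [[s0, s1], [s1, s2]] [l0, l1]' = [g0, g1]' by Cramer's rule.
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("x must take at least two distinct values")
    l0 = (g0 * s2 - g1 * s1) / det
    l1 = (s0 * g1 - s1 * g0) / det

    return [di * (1 + l0 + l1 * xi) for di, xi in zip(d, x)]


# Toy data (made up): four sampled units, equal starting weights.
d = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

w = linear_calibration_weights(d, x, pop_size=50.0, pop_total_x=140.0)

# Calibrated estimate of the population total of y.
y_total_hat = sum(wi * yi for wi, yi in zip(w, y))
```

After calibration, the weights sum to the assumed population size and the weighted total of `x` matches the assumed population total; the estimate of the total of `y` then borrows strength from `x` to the extent that `x` predicts `y`.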

Pages: 117-144

Page count: 28
