Robust estimation of loss-based measures of model performance under covariate shift

Cited by: 0
Authors
Morrison, Samantha [1]
Gatsonis, Constantine [1]
Dahabreh, Issa J. [2,3]
Li, Bing [4]
Steingrimsson, Jon A. [4]
Affiliations
[1] Brown Univ, Dept Biostat, Providence, RI USA
[2] Harvard TH Chan Sch Publ Hlth, CAUSALab, Boston, MA USA
[3] Harvard TH Chan Sch Publ Hlth, Dept Epidemiol, Boston, MA USA
[4] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
Covariate shift; domain adaptation; double robustness; MSE; transportability
DOI
10.1002/cjs.11815
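Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714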
Abstract
We present methods for estimating loss-based measures of the performance of a prediction model in a target population that differs from the source population in which the model was developed, in settings where outcome and covariate data are available from the source population but only covariate data are available on a simple random sample from the target population. Prior work adjusting for differences between the two populations has used various weighting estimators with inverse odds or density ratio weights. Here, we develop more robust estimators for the target population risk (expected loss) that can be used with data-adaptive (e.g., machine learning-based) estimation of nuisance parameters. We examine the large-sample properties of the estimators and evaluate finite-sample performance in simulations. Last, we apply the methods to data from lung cancer screening using nationally representative data from the National Health and Nutrition Examination Survey (NHANES) and extend our methods to account for the complex survey design of the NHANES.
Résumé (translated from French): In this study, the authors present methods for estimating loss-based performance measures of a prediction model when the target population differs from the source population. The setting considered is one in which only covariate data are available on a simple random sample from the target population, whereas outcome and covariate data are available for the source population. In contrast to earlier approaches that adjust for differences between the populations using weighting estimators with inverse odds or density ratio weights, this study proposes robust estimators of the risk (mean loss) in the target population. These estimators can be combined with data-adaptive estimation techniques, such as machine learning, for the nuisance parameters. The asymptotic properties of the proposed estimators are studied theoretically, and their finite-sample behaviour is evaluated by simulation. The empirical application uses lung cancer screening data from the National Health and Nutrition Examination Survey (NHANES), which is representative of the U.S. population. In addition, an extension that accounts for the complex survey design of the NHANES is proposed.
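As a rough illustration of the two estimator families the abstract contrasts, the sketch below computes a normalized inverse-odds weighting estimate of the target-population risk and one standard augmented (outcome-model plus weighted-residual) estimate. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the argument names, and the exact augmented form are hypothetical, and the nuisance estimates (`p_source`, an estimate of Pr(S = 1 | X) for source observations, and `h_source`/`h_target`, estimates of the conditional expected loss given covariates) are assumed to be supplied by the caller.

```python
import numpy as np


def inverse_odds_weights(p_source):
    """Inverse-odds weights (1 - p) / p, with p an estimate of Pr(S = 1 | X).

    S = 1 marks the source sample and S = 0 the target sample (assumed coding).
    """
    p = np.asarray(p_source, dtype=float)
    return (1.0 - p) / p


def weighted_target_risk(loss_source, p_source):
    """Normalized inverse-odds weighting estimate of the target risk.

    loss_source holds the observed losses L(Y_i, g(X_i)) in the source sample.
    """
    w = inverse_odds_weights(p_source)
    loss = np.asarray(loss_source, dtype=float)
    return float(np.sum(w * loss) / np.sum(w))


def augmented_target_risk(loss_source, p_source, h_source, h_target):
    """Augmented estimate (hypothetical form chosen for illustration).

    Averages the predicted loss h(X) over the target sample, then adds
    inverse-odds-weighted source residuals scaled by the target sample size.
    """
    w = inverse_odds_weights(p_source)
    resid = np.asarray(loss_source, dtype=float) - np.asarray(h_source, dtype=float)
    h_t = np.asarray(h_target, dtype=float)
    return float(np.mean(h_t) + np.sum(w * resid) / h_t.size)


# Toy usage with made-up numbers (not data from the paper).
toy_loss = [1.0, 4.0]   # squared errors observed in the source sample
toy_p = [0.5, 0.2]      # assumed estimates of Pr(S = 1 | X)
print(weighted_target_risk(toy_loss, toy_p))
print(augmented_target_risk(toy_loss, toy_p, h_source=[1.0, 4.0], h_target=[2.0, 4.0]))
```

The augmented form reduces to the target-sample average of the predicted loss when the loss model fits the source data exactly, and to an unnormalized weighting estimate when the loss model is identically zero. That behaviour is what motivates combining the two nuisance models in this kind of estimator.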
Pages: 16
Related papers
50 in total
  • [11] Robust exponential squared loss-based estimation in semi-functional linear regression models
    Yu, Ping
    Zhu, Zhongyi
    Zhang, Zhongzhan
    [J]. COMPUTATIONAL STATISTICS, 2019, 34 (02) : 503 - 525
  • [14] Coherent and convex loss-based risk measures for portfolio vectors
    Chen, Yanhong
    Sun, Fei
    Hu, Yijun
    [J]. POSITIVITY, 2018, 22 (01) : 399 - 414
  • [15] Doubly robust inference for targeted minimum loss-based estimation in randomized trials with missing outcome data
    Diaz, Ivan
    van der Laan, Mark J.
    [J]. STATISTICS IN MEDICINE, 2017, 36 (24) : 3807 - 3819
  • [16] Dangers of Bayesian Model Averaging under Covariate Shift
    Izmailov, Pavel
    Nicholson, Patrick
    Lotfi, Sanae
    Wilson, Andrew Gordon
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [17] Balancing Score Adjusted Targeted Minimum Loss-based Estimation
    Lendle, Samuel David
    Fireman, Bruce
    van der Laan, Mark J.
    [J]. JOURNAL OF CAUSAL INFERENCE, 2015, 3 (02) : 139 - 155
  • [18] An energy loss-based vehicular injury severity model
    Ji, Ang
    Levinson, David
    [J]. ACCIDENT ANALYSIS AND PREVENTION, 2020, 146
  • [19] Asymptotics of the loss-based tail risk measures in the presence of extreme risks
    Liu, Jiajun
    Shushi, Tomer
    [J]. EUROPEAN ACTUARIAL JOURNAL, 2024, 14 (01) : 205 - 224
  • [20] Input-dependent estimation of generalization error under covariate shift
    Sugiyama, Masashi
    Mueller, Klaus-Robert
    [J]. STATISTICS & RISK MODELING, 2005, 23 (04) : 249 - 279