Robust estimation of loss-based measures of model performance under covariate shift

Cited by: 0
Authors
Morrison, Samantha [1]
Gatsonis, Constantine [1]
Dahabreh, Issa J. [2, 3]
Li, Bing [4]
Steingrimsson, Jon A. [4]
Affiliations
[1] Brown Univ, Dept Biostat, Providence, RI USA
[2] Harvard TH Chan Sch Publ Hlth, CAUSALab, Boston, MA USA
[3] Harvard TH Chan Sch Publ Hlth, Dept Epidemiol, Boston, MA USA
[4] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
Covariate shift; domain adaptation; double robustness; MSE; transportability
DOI
10.1002/cjs.11815
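Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract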
We present methods for estimating loss-based measures of the performance of a prediction model in a target population that differs from the source population in which the model was developed, in settings where outcome and covariate data are available from the source population but only covariate data are available on a simple random sample from the target population. Prior work adjusting for differences between the two populations has used various weighting estimators with inverse odds or density ratio weights. Here, we develop more robust estimators for the target population risk (expected loss) that can be used with data-adaptive (e.g., machine learning-based) estimation of nuisance parameters. We examine the large-sample properties of the estimators and evaluate finite-sample performance in simulations. Last, we apply the methods to data from lung cancer screening using nationally representative data from the National Health and Nutrition Examination Survey (NHANES) and extend our methods to account for the complex survey design of the NHANES.
Dans cette étude, les auteurs présentent des méthodes visant à estimer les mesures de performance basées sur la fonction de perte d'un modèle prédictif, lorsque la population cible diffère de la population source. Le contexte considéré est celui où seules les données de covariables sont disponibles sur un échantillon aléatoire simple de la population cible, tandis que les données de réponse et covariables le sont pour la population source. Contrairement aux approches antérieures qui ajustent les différences entre les populations en utilisant des estimateurs de pondération avec des poids de rapports de cotes inverses ou de rapports de densité, cette étude propose des estimateurs robustes du risque (perte moyenne) dans la population cible. Ces estimateurs peuvent être associés à des techniques d'estimation adaptatives aux données, telles que l'apprentissage statistique, pour les paramètres nuisibles. Les propriétés asymptotiques des estimateurs proposés sont étudiées théoriquement, et leur comportement à taille finie est évalué par simulations. L'application empirique porte sur des données de dépistage du cancer du poumon issues de l'enquête "National Health and Nutrition Examination Survey" (NHANES), représentative de la population américaine. De plus, une extension permettant de tenir compte du plan de sondage complexe du NHANES est proposée.
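To make the setting in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of estimating a target-population mean squared error when outcomes are observed only in the source sample. It contrasts a naive source-only estimate, an inverse-odds weighting estimate, and a generic augmented estimate that combines the weights with a regression model for the loss. The simulated data, the fixed prediction rule `predict`, the helper `fit_logistic`, and the parametric nuisance models are all assumptions made for illustration; the paper's exact estimators may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_tgt = 20_000, 20_000

# Hypothetical data: the covariate distribution shifts from N(0,1) to N(1,1).
x_src = rng.normal(0.0, 1.0, n_src)
y_src = x_src + rng.normal(0.0, 1.0, n_src)  # outcomes observed in the source only
x_tgt = rng.normal(1.0, 1.0, n_tgt)          # target sample: covariates only

def predict(x):
    """Fixed (deliberately misspecified) prediction model being evaluated."""
    return 0.5 * x

# Squared-error loss, computable only where outcomes are available.
loss_src = (y_src - predict(x_src)) ** 2

def fit_logistic(design, s, n_iter=25):
    """Logistic regression fitted by Newton's method (illustrative helper)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ beta)))
        grad = design.T @ (s - p)
        hess = (design * (p * (1.0 - p))[:, None]).T @ design
        beta += np.linalg.solve(hess, grad)
    return beta

# Nuisance 1: probability of source membership, Pr[S = 1 | X], from pooled data.
x_all = np.concatenate([x_src, x_tgt])
s_all = np.concatenate([np.ones(n_src), np.zeros(n_tgt)])
beta = fit_logistic(np.column_stack([np.ones_like(x_all), x_all]), s_all)
p_src = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x_src)))
w_src = (1.0 - p_src) / p_src  # inverse-odds weights for source observations

# Nuisance 2: conditional mean of the loss given X in the source (quadratic in X).
def loss_design(x):
    return np.column_stack([np.ones_like(x), x, x ** 2])

gamma, *_ = np.linalg.lstsq(loss_design(x_src), loss_src, rcond=None)
h_src = loss_design(x_src) @ gamma
h_tgt = loss_design(x_tgt) @ gamma

# Three estimates of the target-population MSE.
naive = loss_src.mean()                       # ignores covariate shift
iow = np.sum(w_src * loss_src) / n_tgt        # inverse-odds weighting
dr = (np.sum(w_src * (loss_src - h_src)) + np.sum(h_tgt)) / n_tgt  # augmented

# In this simulation the true target MSE is 0.25 * E[X^2 | target] + 1 = 1.5.
print(f"naive={naive:.3f}  iow={iow:.3f}  augmented={dr:.3f}")
```

In this simulated setup both nuisance models are correctly specified, so the weighted and augmented estimates should both land near the true value while the naive source-only estimate does not. The appeal of the augmented form is that it can remain consistent when only one of the two nuisance models is correct, which is the sense of "double robustness" listed in the keywords.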
Pages: 16
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 4307 - 4308