Confidence Interval Estimation of Predictive Performance in the Context of AutoML

Cited by: 0
Authors
Paraschakis, Konstantinos [1]
Castellani, Andrea [4]
Borboudakis, Giorgos [1]
Tsamardinos, Ioannis [1, 2, 3]
Affiliations
[1] JADBio Gnosis DA SA, Iraklion 70013, Crete, Greece
[2] FORTH, Inst Appl & Computat Math, Iraklion 70013, Crete, Greece
[3] Univ Crete, Dept Comp Sci, Iraklion 70013, Crete, Greece
[4] Honda Res Inst Europe GmbH, D-63073 Offenbach, Germany
Source
INTERNATIONAL CONFERENCE ON AUTOMATED MACHINE LEARNING | 2024 / Vol. 256
Keywords
AREAS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Any supervised machine learning analysis is required to provide an estimate of the out-of-sample predictive performance. However, it is imperative to also provide a quantification of the uncertainty of this performance in the form of a confidence or credible interval (CI) and not just a point estimate. In an AutoML setting, estimating the CI is challenging due to the "winner's curse", i.e., the bias of estimation due to cross-validating several machine learning pipelines and selecting the winning one. In this work, we perform a comparative evaluation of 9 state-of-the-art methods and variants in CI estimation in an AutoML setting on a corpus of real and simulated datasets. The methods are compared in terms of inclusion percentage (does a 95% CI include the true performance at least 95% of the time), CI tightness (tighter CIs are preferable as being more informative), and execution time. The evaluation is the first one that covers most, if not all, such methods and extends previous work to imbalanced and small-sample tasks. In addition, we present a variant, called BBC-F, of an existing method (the Bootstrap Bias Correction, or BBC) that maintains the statistical properties of the BBC but is more computationally efficient. The results support that BBC-F and BBC dominate the other methods in all metrics measured.
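The two quality criteria named in the abstract, inclusion percentage and CI tightness, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and the interval values are hypothetical, and tightness is assumed here to mean the average interval width.

```python
import numpy as np

def inclusion_percentage(lower, upper, true_perf):
    """Percentage of intervals [lower, upper] that contain the true performance."""
    lower, upper, true_perf = map(np.asarray, (lower, upper, true_perf))
    covered = (lower <= true_perf) & (true_perf <= upper)
    return float(100.0 * covered.mean())

def mean_width(lower, upper):
    """Average interval width (assumed proxy for tightness; smaller is tighter)."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))

# Hypothetical 95% CIs from four repeated analyses, with the true
# out-of-sample performance of the selected model in each repetition.
lower = [0.70, 0.65, 0.80, 0.72]
upper = [0.85, 0.75, 0.90, 0.78]
true_perf = [0.80, 0.77, 0.85, 0.75]

print(inclusion_percentage(lower, upper, true_perf))
print(mean_width(lower, upper))
```

Under the abstract's criterion, a well-calibrated 95% CI method would reach an inclusion percentage of at least 95 over many repetitions, and among methods that do, the one with the smaller average width is the more informative.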
Pages: 14