Evaluation of several nonparametric bootstrap methods to estimate confidence intervals for software metrics

Cited by: 32
Authors
Lei, S
Smith, MR
Affiliations
[1] Gen Dynam Canada, Calgary, AB T2E 8P2, Canada
[2] Univ Calgary, Dept Elect & Comp Engn, Calgary, AB T2N 1N4, Canada
Keywords
Efron bootstrap; software metrics; confidence intervals; correction of possible biases in Efron bootstrap estimates;
DOI
10.1109/TSE.2003.1245301
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Sample statistics and model parameters can be used to infer the properties, or characteristics, of the underlying population in typical data-analytic situations. Confidence intervals can provide an estimate of the range within which the true value of the statistic lies. A narrow confidence interval implies low variability of the statistic, justifying a strong conclusion made from the analysis. Many statistics used in software metrics analysis do not come with theoretical formulas to allow such accuracy assessment. The Efron bootstrap statistical analysis appears to address this weakness. In this paper, we present an empirical analysis of the reliability of several Efron nonparametric bootstrap methods in assessing the accuracy of sample statistics in the context of software metrics. A brief review of the basic concepts of various methods available for the estimation of statistical errors is provided, with the stated advantages of the Efron bootstrap discussed. Validations of several different bootstrap algorithms are performed across basic software metrics in both simulated and industrial software engineering contexts. It was found that the 90 percent confidence intervals for mean, median, and Spearman correlation coefficients were accurately predicted. The 90 percent confidence intervals for the variance and Pearson correlation coefficients were typically underestimated (60-70 percent confidence interval), and those for skewness and kurtosis overestimated (98-100 percent confidence interval). It was found that the bias-corrected and accelerated bootstrap approach gave the most consistent confidence intervals, but its accuracy depended on the metric examined. A method for correcting the under- or overestimation of bootstrap confidence intervals for small data sets is suggested, but the success of the approach was found to be inconsistent across the tested metrics.
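For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of the simplest nonparametric bootstrap interval, the percentile method, together with a simulation that checks how often a nominal 90 percent interval actually contains the true value. It is not the authors' implementation and it does not cover the bias-corrected and accelerated variant. The function names, parameters, resample counts, and the lognormal "module size" data are all assumptions made for illustration.

```python
import random
import statistics


def percentile_bootstrap_ci(sample, statistic, level=0.90,
                            n_resamples=2000, seed=0):
    """Percentile bootstrap interval for statistic(sample).

    Illustrative helper (hypothetical name): resample with replacement,
    recompute the statistic, and read the interval off the sorted replicates.
    """
    rng = random.Random(seed)
    n = len(sample)
    replicates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(int((1.0 - alpha) * n_resamples), n_resamples - 1)
    return replicates[lo_idx], replicates[hi_idx]


def empirical_coverage(draw_sample, statistic, true_value, level=0.90,
                       n_trials=200, n_resamples=500, seed=1):
    """Fraction of simulated samples whose interval contains true_value.

    draw_sample is an assumed callback that takes a random.Random and
    returns one simulated data set from a known population.
    """
    rng = random.Random(seed)
    hits = 0
    for trial in range(n_trials):
        sample = draw_sample(rng)
        lo, hi = percentile_bootstrap_ci(
            sample, statistic, level, n_resamples, seed=trial
        )
        hits += lo <= true_value <= hi
    return hits / n_trials


if __name__ == "__main__":
    # Hypothetical metric data: 30 skewed "module size" values.
    data_rng = random.Random(42)
    sizes = [data_rng.lognormvariate(4.0, 0.8) for _ in range(30)]
    print("90% CI for mean:  ", percentile_bootstrap_ci(sizes, statistics.mean))
    print("90% CI for median:", percentile_bootstrap_ci(sizes, statistics.median))

    # Coverage check against a population whose true mean is known to be 0.
    coverage = empirical_coverage(
        lambda rng: [rng.gauss(0.0, 1.0) for _ in range(30)],
        statistics.mean,
        true_value=0.0,
    )
    print("Empirical coverage of nominal 90% interval:", coverage)
```

An empirical coverage well below the nominal level corresponds to what the abstract calls an underestimated interval, and a coverage well above it to an overestimated one. Swapping `statistics.mean` for a variance or skewness function is one way to explore that behavior on simulated data.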
Pages: 996-1004
Number of pages: 9