Assessing model fit by cross-validation

Cited by: 623
Authors
Hawkins, DM [1]
Basak, SC
Mills, D
Affiliations
[1] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Nat Resources Res Inst, Duluth, MN 55811 USA
DOI
10.1021/ci025626i
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
When QSAR models are fitted, it is important to validate any fitted model, that is, to check that it is plausible that its predictions will carry over to fresh data not used in the model fitting exercise. There are two standard ways of doing this: using a separate hold-out test sample, and the computationally much more burdensome leave-one-out cross-validation, in which the entire pool of available compounds is used both to fit the model and to assess its validity. We show by theoretical argument and empirical study of a large QSAR data set that when the available sample size is small (in the dozens or scores rather than the hundreds), holding a portion of it back for testing is wasteful, and that it is much better to use cross-validation, but to ensure that this is done properly.
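The two validation schemes named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the model is ordinary least squares, and the helper names (`fit_ols`, `predict_ols`, `loo_predictions`, `holdout_predictions`) are hypothetical. It computes leave-one-out predictions by refitting the model n times, summarizes them as PRESS and a cross-validated q², and contrasts this with a single hold-out split that sets aside a third of a 30-case sample.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def loo_predictions(X, y):
    """Predict each case from a model fitted to the other n - 1 cases."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # drop case i from the fit
        coef = fit_ols(X[keep], y[keep])
        preds[i] = predict_ols(coef, X[i:i + 1])[0]
    return preds

def holdout_predictions(X, y, n_test, rng):
    """Fit on a random training subset, predict the held-out cases."""
    idx = rng.permutation(len(y))
    test, train = idx[:n_test], idx[n_test:]
    coef = fit_ols(X[train], y[train])
    return test, predict_ols(coef, X[test])

# Synthetic small-sample data (assumed, not from the paper).
rng = np.random.default_rng(0)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 0.0]) + rng.normal(scale=0.5, size=n)

# Leave-one-out: every case is used for both fitting and assessment.
loo_pred = loo_predictions(X, y)
press = np.sum((y - loo_pred) ** 2)           # predicted residual sum of squares
q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)

# Hold-out: only 20 cases fit the model and only 10 assess it.
test_idx, ho_pred = holdout_predictions(X, y, n_test=10, rng=rng)
ho_mse = np.mean((y[test_idx] - ho_pred) ** 2)

print(f"LOO q2 = {q2:.3f}, LOO MSE = {press / n:.3f}, hold-out MSE = {ho_mse:.3f}")
```

In this sketch the model has no selection step. If a workflow also chose descriptors or tuning parameters, one plausible reading of doing cross-validation "properly" is that those choices would be repeated inside each leave-one-out fit rather than made once on the full sample; that reading is an interpretation, not a statement from the abstract.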
Pages: 579-586
Page count: 8