Tight bounds for the probability of overfitting

Cited by: 3
Author
Vorontsov, K. V. [1]
Affiliation
[1] Russian Acad Sci, Dorodnicyn Comp Ctr, Moscow 119333, Russia
Funding
Russian Foundation for Basic Research
Keywords
Doklady Mathematics; Error Vector; Error Frequency; Tight Bound; Weak Probability
DOI
10.1134/S1064562409060032
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701 ; 070101 ;
Abstract
A combinatorial approach is developed that leads to tight bounds for the probability of overfitting in a number of special cases. The classical Vapnik-Chervonenkis bound is easy to restate under weak probabilistic assumptions in terms of Δ, the diversity coefficient of A, which equals the number of different error vectors generated by all possible algorithms a from A. An experimental analysis of the major causes of the bound's overestimation shows that the probability of overfitting depends substantially not only on the number of different error vectors but also on the degree of their difference. The set A may contain a large number of pairs of similar algorithms. Specifically, most classification algorithms used in practice have a separating surface that is continuous with respect to the parameters.
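The two quantities named in the abstract can be illustrated on a toy example. The sketch below is not taken from the paper: it assumes each algorithm is represented by a binary error vector over a finite full sample, that every split of the sample into training and test parts is equally likely, and that "overfitting" means some algorithm's test error frequency exceeds its training error frequency by at least eps. The function names `diversity_coefficient` and `overfitting_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def diversity_coefficient(error_vectors):
    """Number of distinct error vectors in the set (assumed reading of Delta)."""
    return len({tuple(v) for v in error_vectors})


def overfitting_probability(error_vectors, train_size, eps):
    """Fraction of equally likely train/test splits on which at least one
    algorithm has (test error frequency - train error frequency) >= eps.

    Brute-force enumeration, so only suitable for tiny samples.
    """
    n = len(error_vectors[0])
    test_size = n - train_size
    bad = total = 0
    for train in combinations(range(n), train_size):
        total += 1
        for v in error_vectors:
            train_err = sum(v[i] for i in train)
            test_err = sum(v) - train_err
            # Exact rational arithmetic avoids floating-point ties at eps.
            gap = Fraction(test_err, test_size) - Fraction(train_err, train_size)
            if gap >= eps:
                bad += 1
                break
    return Fraction(bad, total)


# Two sets with the same number of distinct error vectors (3 each),
# every vector having exactly two errors on a sample of six objects.
similar = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
]
scattered = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
]

eps = Fraction(2, 3)
for name, vectors in [("similar", similar), ("scattered", scattered)]:
    delta = diversity_coefficient(vectors)
    prob = overfitting_probability(vectors, train_size=3, eps=eps)
    print(f"{name}: Delta = {delta}, overfitting probability = {prob}")
```

In this toy setting both sets have the same diversity coefficient, yet the set of similar vectors gives probability 1/2 while the scattered set gives 3/5, which equals the union bound over three vectors. This matches the abstract's remark that the degree of difference between error vectors matters, not only their number.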
Pages: 793-796
Page count: 4
Related papers
50 in total
  • [1] Tight bounds for the probability of overfitting
    K. V. Vorontsov
    [J]. Doklady Mathematics, 2009, 80 : 793 - 796
  • [3] Cover-based combinatorial bounds on probability of overfitting
    Frey, A. I.
    Tolstikhin, I. O.
    [J]. DOKLADY MATHEMATICS, 2014, 89 (02) : 185 - 187
  • [4] Exact Combinatorial Bounds on the Probability of Overfitting for Empirical Risk Minimization
    Vorontsov K.V.
    [J]. Pattern Recognition and Image Analysis, 2010, 20 (03) : 269 - 285
  • [5] TIGHT PROBABILITY BOUNDS WITH PAIRWISE INDEPENDENCE
    Ramachandra, Arjun Kodagehalli
    Natarajan, Karthik
    [J]. SIAM JOURNAL ON DISCRETE MATHEMATICS, 2023, 37 (02) : 516 - 555
  • [6] The probability of backtest overfitting
    Bailey, David H.
    Borwein, Jonathan M.
    de Prado, Marcos Lopez
    Zhu, Qiji Jim
    [J]. JOURNAL OF COMPUTATIONAL FINANCE, 2017, 20 (04) : 39 - 69
  • [8] Arbitrarily tight upper and lower bounds on the Bayesian probability of error
    Avi-Itzhak, H
    Diep, T
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1996, 18 (01) : 89 - 91
  • [9] COMBINATORIAL BOUNDS OF OVERFITTING FOR THRESHOLD CLASSIFIERS
    Ishkina, Sh Kh
    [J]. UFA MATHEMATICAL JOURNAL, 2018, 10 (01) : 49 - 63
  • [10] Tight tail probability bounds for distribution-free decision making
    Roos, Ernst
    Brekelmans, Ruud
    van Eekelen, Wouter
    den Hertog, Dick
    van Leeuwaarden, Johan S. H.
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2022, 299 (03) : 931 - 944