On the Upper Bounds of the Real-Valued Predictions

Cited by: 15
Authors
Benevenuta, Silvia [1]
Fariselli, Piero [1]
Affiliations
[1] Univ Turin, Dept Med Sci, Via Santena 19, I-10123 Turin, Italy
Keywords
Upper bound; free energy; machine learning; regression; prediction
DOI
10.1177/1177932219871263
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Predictions are fundamental in science because they allow us to test and falsify theories. Predictions are ubiquitous in bioinformatics and also help when no first principles are available. Predictions can be divided into classification (when a label is associated with a given input) and regression (when a real value is assigned). Different scores are used to assess the performance of regression predictors; the most widely adopted include the mean square error, the Pearson correlation (ρ), and the coefficient of determination (R^2). It is commonly assumed that the theoretical upper bound of the last 2 indices is 1; however, their upper bounds depend on both the experimental uncertainty and the distribution of the target variable. A narrow distribution of the target variable may induce a low upper bound. Knowledge of the theoretical upper bounds also has 2 practical implications: (1) comparing predictors tested on different data sets may lead to a wrong ranking, and (2) performances higher than the theoretical upper bounds indicate overtraining and improper usage of the learning data sets. Here, we derive the upper bound for the coefficient of determination, showing that it is lower than that of the square of the Pearson correlation. We provide analytical equations for both indices that can be used to evaluate the upper bound of the predictions when the experimental uncertainty and the target distribution are available. Our considerations are general and applicable to all regression predictors.
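The dependence of both bounds on the experimental uncertainty and on the spread of the target variable can be illustrated with a small simulation. The sketch below is not taken from the paper; it rests on an assumed model in which the observed target and the best achievable predictor are two independent noisy measurements of the same latent value, with latent standard deviation sigma_db and Gaussian measurement noise sigma (both names are hypothetical). Under that assumption, the Pearson correlation is bounded by sigma_db^2 / (sigma_db^2 + sigma^2) and R^2 by (sigma_db^2 - sigma^2) / (sigma_db^2 + sigma^2), so the R^2 bound never exceeds the square of the correlation bound.

```python
import random


def theoretical_bounds(sigma_db, sigma):
    """Closed-form bounds under the ASSUMED two-noisy-measurements model.

    sigma_db: standard deviation of the latent (noise-free) target values.
    sigma:    standard deviation of the experimental noise.
    """
    var_true = sigma_db ** 2
    var_noise = sigma ** 2
    rho = var_true / (var_true + var_noise)
    r2 = (var_true - var_noise) / (var_true + var_noise)
    return rho, r2


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def simulate(sigma_db, sigma, n=200_000, seed=0):
    """Draw latent values, then two independent noisy readings of each."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, sigma_db) for _ in range(n)]
    observed = [t + rng.gauss(0.0, sigma) for t in latent]
    # Assumption: the best predictor is as good as a second experiment.
    predicted = [t + rng.gauss(0.0, sigma) for t in latent]
    return pearson(observed, predicted), r_squared(observed, predicted)


if __name__ == "__main__":
    # Same noise, progressively narrower target distribution.
    for sigma_db in (2.0, 1.0, 0.5):
        rho_t, r2_t = theoretical_bounds(sigma_db, sigma=0.5)
        rho_s, r2_s = simulate(sigma_db, sigma=0.5)
        print(f"sigma_db={sigma_db}: rho bound={rho_t:.3f} (sim {rho_s:.3f}), "
              f"R2 bound={r2_t:.3f} (sim {r2_s:.3f})")
```

With sigma_db = 1.0 and sigma = 0.5, the closed forms give a correlation bound of 0.8 and an R^2 bound of 0.6; when the target spread shrinks to the noise level (sigma_db = sigma), the correlation bound drops to 0.5 and the R^2 bound to 0, which is the narrow-distribution effect described in the abstract.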
Pages: 4