Probabilistic performance estimators for computational chemistry methods: Systematic improvement probability and ranking probability matrix. I. Theory

Cited by: 13
Authors
Pernot, Pascal [1]
Savin, Andreas [2,3]
Affiliations
[1] Univ Paris Saclay, CNRS, UMR8000, Inst Chim Phys, F-91405 Orsay, France
[2] CNRS, Lab Chim Theor, F-75252 Paris, France
[3] Sorbonne Univ, UPMC Univ Paris 06, F-75252 Paris, France
Source
JOURNAL OF CHEMICAL PHYSICS | 2020 / Vol. 152 / Issue 16
Keywords
PREDICTION UNCERTAINTY; STATISTICS; BOOTSTRAP; APPROXIMATIONS; EXPRESSION; GUIDE
DOI
10.1063/5.0006202
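Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification code
070304; 081704
Abstract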
The comparison of benchmark error sets is an essential tool for the evaluation of theories in computational chemistry. The standard ranking of methods by their mean unsigned error is unsatisfactory for several reasons linked to the non-normality of the error distributions and the presence of underlying trends. Complementary statistics have recently been proposed to palliate such deficiencies, such as quantiles of the absolute error distribution or the mean prediction uncertainty. We introduce here a new score, the systematic improvement probability, based on the direct system-wise comparison of absolute errors. Independent of the chosen scoring rule, the uncertainty of the statistics due to the incompleteness of the benchmark datasets is also generally overlooked. However, this uncertainty is essential to appreciate the robustness of rankings. In the present article, we develop two indicators based on robust statistics to address this problem: P_inv, the inversion probability between two values of a statistic, and P_r, the ranking probability matrix. We also demonstrate the essential contribution of the correlations between error sets in these score comparisons.
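The abstract names the scores but this record does not give their formulas, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the systematic improvement probability can be estimated as the fraction of benchmark systems for which one method has a strictly smaller absolute error than another, and that an inversion probability can be approximated by a paired bootstrap that resamples systems jointly for both methods, which preserves the correlation between the two error sets. The function names (`sip`, `mue`, `inversion_probability`) and the example error values are hypothetical.

```python
import random
from statistics import mean


def sip(errors_a, errors_b):
    """Fraction of systems where method A has a strictly smaller
    absolute error than method B (assumed reading of the SIP)."""
    if len(errors_a) != len(errors_b):
        raise ValueError("error sets must cover the same systems")
    wins = sum(abs(a) < abs(b) for a, b in zip(errors_a, errors_b))
    return wins / len(errors_a)


def mue(errors):
    """Mean unsigned error of one error set."""
    return mean(abs(e) for e in errors)


def inversion_probability(errors_a, errors_b, stat=mue, n_boot=2000, seed=0):
    """Paired-bootstrap estimate of how often the observed ordering of
    stat(A) and stat(B) flips (assumed reading of P_inv)."""
    if len(errors_a) != len(errors_b):
        raise ValueError("error sets must cover the same systems")
    rng = random.Random(seed)
    n = len(errors_a)
    observed = stat(errors_a) - stat(errors_b)
    flips = 0
    for _ in range(n_boot):
        # Resample system indices once and reuse them for both methods,
        # so the correlation between the two error sets is kept.
        idx = [rng.randrange(n) for _ in range(n)]
        diff = stat([errors_a[i] for i in idx]) - stat([errors_b[i] for i in idx])
        if diff * observed < 0:
            flips += 1
    return flips / n_boot


# Hypothetical signed errors of two methods on the same four systems.
method_a = [1.0, -1.0, 0.5, 2.0]
method_b = [2.0, 0.5, 1.0, -3.0]
print("SIP(A over B):", sip(method_a, method_b))
print("P_inv for MUE:", inversion_probability(method_a, method_b))
```

Any other statistic, such as a quantile of the absolute errors, can be passed through the `stat` argument. If the two observed statistics are exactly equal, this sketch reports zero, since no ordering exists to invert.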
Pages: 15