Using the Pearson's correlation coefficient as the sole metric to measure the accuracy of quantitative trait prediction: is it sufficient?

Cited by: 1
Authors
Pan, Shouhui [1 ,2 ]
Liu, Zhongqiang [1 ,2 ]
Han, Yanyun [1 ,2 ]
Zhang, Dongfeng [1 ,2 ]
Zhao, Xiangyu [1 ,2 ]
Li, Jinlong [1 ,2 ]
Wang, Kaiyi [1 ,2 ]
Affiliations
[1] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing, Peoples R China
[2] Natl Engn Res Ctr Informat Technol Agr, Beijing, Peoples R China
Keywords
genomic selection; quantitative trait prediction; Pearson's correlation coefficient; evaluation metric; regression prediction
DOI
10.3389/fpls.2024.1480463
Chinese Library Classification
Q94 [Botany]
Discipline classification code
071001
Abstract
Evaluating the accuracy of quantitative trait prediction is crucial for choosing the best model among several candidates in plant breeding. Pearson's correlation coefficient (PCC), a metric that quantifies the strength of the linear association between two variables, is widely used to evaluate the accuracy of quantitative trait prediction models and performs well in most circumstances. However, PCC may not always offer a comprehensive view of predictive accuracy, especially in cases involving nonlinear relationships or complex dependencies in machine learning-based methods. Many papers on quantitative trait prediction use PCC as the sole metric to evaluate the accuracy of their models, which is insufficient from a formal perspective. This study addresses the issue by presenting a typical example and conducting a comparative analysis of PCC and nine other evaluation metrics using four traditional methods and four machine learning-based methods, thereby contributing to the practical applicability and reliability of plant quantitative trait prediction models. It is recommended to employ PCC in conjunction with other evaluation metrics, chosen according to the specific application scenario, to reduce the likelihood of drawing misleading conclusions.
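
The point that PCC alone can mislead may be easier to see with a small numerical sketch. The example below is not taken from the study: the trait values, the two hypothetical "models", and the choice of RMSE and MAE as companion metrics are all assumptions made for illustration (the abstract does not name the nine other metrics). It shows that predictions which are a scaled and shifted copy of the observations reach a PCC of about 1 while being far from the observed values.

```python
import numpy as np


def pcc(y_true, y_pred):
    """Pearson's correlation coefficient between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Hypothetical observed trait values (illustrative only, not data from the study).
y_true = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# Hypothetical model A: small errors around the observed values.
pred_a = y_true + np.array([0.5, -0.5, 0.5, -0.5, 0.0])

# Hypothetical model B: perfectly linear in y_true, but scaled and shifted,
# so every prediction is far from the observed value.
pred_b = 0.5 * y_true + 20.0

metrics = {
    name: {"pcc": pcc(y_true, p), "rmse": rmse(y_true, p), "mae": mae(y_true, p)}
    for name, p in {"A": pred_a, "B": pred_b}.items()
}

for name, m in metrics.items():
    print(f"model {name}: PCC={m['pcc']:.4f}  RMSE={m['rmse']:.3f}  MAE={m['mae']:.3f}")
```

Ranked by PCC alone, model B would be preferred over model A, even though its errors are much larger. Whether that matters depends on the application: if only the ranking of candidates is needed, a high PCC may be enough, whereas predicting the trait value itself calls for an error-based metric alongside it.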
Pages: 6