Sample size determination for training set optimization in genomic prediction

Cited by: 9
Authors
Wu, Po-Ya [1, 2]
Ou, Jen-Hsiang [1, 3]
Liao, Chen-Tuo [1]
Affiliations
[1] Natl Taiwan Univ, Dept Agron, Taipei, Taiwan
[2] Heinrich Heine Univ, Inst Quant Genet & Genom Plants, Dusseldorf, Germany
[3] Uppsala Univ, Dept Med Biochem & Microbiol, Uppsala, Sweden
Keywords
CALIBRATION SET; LINEAR-MODELS; SELECTION; ACCURACY; INDIVIDUALS; REGRESSION; PRECISION
DOI
10.1007/s00122-023-04254-9
Chinese Library Classification
S3 [Agronomy]
Discipline classification code
0901
Abstract
Genomic prediction (GP) is a statistical method used to select for quantitative traits in animal or plant breeding. A statistical prediction model is first built from the phenotypic and genotypic data of a training set. The trained model is then used to predict genomic estimated breeding values (GEBVs) for individuals within a breeding population. The sample size of the training set is usually chosen with regard to the time and space constraints that are inevitable in an agricultural experiment. However, how to determine this sample size remains an unresolved issue in GP studies. By using a logistic growth curve to describe how prediction accuracy for the GEBVs changes with training set size, a practical approach was developed to determine a cost-effective optimal training set for a given genome dataset with known genotypic data. Three real genome datasets were used to illustrate the proposed approach. An R function is provided to facilitate widespread application of this approach to sample size determination, helping breeders identify a set of genotypes of economical size for selective phenotyping.
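The idea of relating prediction accuracy to training set size through a logistic growth curve can be illustrated with a minimal Python sketch. This is not the authors' R function and does not reproduce their procedure. The parameterization acc(n) = K / (1 + b * exp(-r * n)), the grid-search fit, the function names, and the "reach a fixed fraction of the plateau" stopping rule are all assumptions made for illustration. The accuracy values are synthetic and noise-free, generated from a known curve.

```python
import math


def logistic(n, K, b, r):
    """Assumed logistic growth curve: accuracy as a function of training set size n."""
    return K / (1.0 + b * math.exp(-r * n))


def fit_logistic(sizes, accs, k_steps=200):
    """Fit (K, b, r) by a grid search over the plateau K.

    For each candidate K the curve is linearized as
    ln(K / acc - 1) = ln(b) - r * n and solved by ordinary least squares;
    the K with the smallest squared error on the original scale is kept.
    Assumes every accuracy lies strictly between 0 and 1.
    """
    y_max = max(accs)
    m = len(sizes)
    mean_n = sum(sizes) / m
    sxx = sum((n - mean_n) ** 2 for n in sizes)
    best = None
    for i in range(1, k_steps + 1):
        K = y_max + i * (1.0 - y_max) / k_steps  # candidate plateau in (y_max, 1]
        z = [math.log(K / y - 1.0) for y in accs]
        mean_z = sum(z) / m
        slope = sum((n - mean_n) * (zi - mean_z) for n, zi in zip(sizes, z)) / sxx
        intercept = mean_z - slope * mean_n
        b, r = math.exp(intercept), -slope
        sse = sum((logistic(n, K, b, r) - y) ** 2 for n, y in zip(sizes, accs))
        if best is None or sse < best[0]:
            best = (sse, K, b, r)
    return best[1], best[2], best[3]


def size_for_fraction(K, b, r, fraction=0.95):
    """Smallest integer n at which the fitted curve reaches fraction * K.

    Solving K / (1 + b * exp(-r * n)) >= fraction * K for n gives
    n >= ln(b * fraction / (1 - fraction)) / r.
    """
    n = math.log(b * fraction / (1.0 - fraction)) / r
    return max(0, math.ceil(n))


# Synthetic, noise-free accuracies generated from a known curve (illustration only).
sizes = list(range(25, 301, 25))
accs = [logistic(n, 0.8, 4.0, 0.02) for n in sizes]

K_hat, b_hat, r_hat = fit_logistic(sizes, accs)
n_star = size_for_fraction(K_hat, b_hat, r_hat, fraction=0.95)
```

In practice the accuracies would come from cross-validation or a similar estimate at each candidate training set size, and the 95% threshold would be replaced by whatever cost criterion the breeding program uses.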
Pages: 13