Training Set Construction for Genomic Prediction in Auto-Tetraploids: An Example in Potato

Cited by: 0
Authors
Wilson, Stefan [1]
Malosetti, Marcos [1]
Maliepaard, Chris [2]
Mulder, Han A. [3]
Visser, Richard G. F. [2]
van Eeuwijk, Fred [1]
Affiliations
[1] Wageningen Univ & Res, Biometris, Wageningen, Netherlands
[2] Wageningen Univ & Res, Plant Breeding, Wageningen, Netherlands
[3] Wageningen Univ & Res, Anim Breeding & Genom, Wageningen, Netherlands
Source
Keywords
training set construction; potato; sampling technique(s); genomic prediction (GP); auto-tetraploid; POPULATION-STRUCTURE; GENETIC-DISTANCE; R-PACKAGE; SELECTION; REGRESSION; PLANT; INDIVIDUALS; TRAITS;
DOI
10.3389/fpls.2021.771075
Chinese Library Classification
Q94 [Botany]
Subject classification code
071001
Abstract
Training set construction is an important prerequisite to Genomic Prediction (GP). It has been studied in diploids, but polyploids have not received the same attention. Polyploidy is common in many crop plants, for example banana and blueberry, as well as potato, which is the third most important crop in the world in terms of food consumption, after rice and wheat. The aim of this study was to investigate the impact of different training set construction methods using a publicly available diversity panel of tetraploid potatoes. Four methods of training set construction were compared: simple random sampling, stratified random sampling, genetic distance sampling, and sampling based on the coefficient of determination (CDmean). For stratified random sampling, population structure analyses were carried out to define sub-populations, but since sub-populations accounted for only 16.6% of genetic variation, differences between stratified and simple random sampling were negligible. For genetic distance sampling, four genetic distance measures were compared; they performed similarly, but Euclidean distance was the most consistent. In the majority of cases CDmean was the best sampling method. Compared to simple random sampling, it gave improvements of 4-14% in cross-validation scenarios and 2-8% in scenarios with an independent test set, while genetic distance sampling gave improvements of 5.5-10.5% and 0.4-4.5%, respectively. No interaction was found between sampling method and the statistical model for the traits analyzed.
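To make the contrast between the first two sampling methods concrete, the following is a minimal Python sketch of simple random sampling and stratified random sampling of a training set. It is not the authors' implementation: the function names, the sub-population labels, and the use of proportional allocation (each sub-population contributes in proportion to its size) are assumptions made for illustration only, since the abstract does not state the allocation rule used.

```python
import random
from collections import defaultdict


def simple_random_sample(ids, n, seed=0):
    """Draw n genotype ids uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(ids), n)


def stratified_random_sample(groups, n, seed=0):
    """Draw n genotype ids, sampling within each sub-population.

    groups maps genotype id -> sub-population label (hypothetical labels).
    Assumption: proportional allocation with largest-remainder rounding.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for gid, label in groups.items():
        strata[label].append(gid)

    total = len(groups)
    # Ideal (fractional) share of the training set for each stratum.
    quotas = {k: n * len(v) / total for k, v in strata.items()}
    alloc = {k: int(q) for k, q in quotas.items()}

    # Hand the remaining slots to the strata with the largest remainders.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1

    chosen = []
    for k in sorted(strata):
        chosen.extend(rng.sample(strata[k], alloc[k]))
    return chosen


# Hypothetical panel: 100 genotypes in three sub-populations of unequal size.
panel = {f"g{i}": ("A" if i < 60 else "B" if i < 90 else "C") for i in range(100)}
training_srs = simple_random_sample(panel, 20)
training_strat = stratified_random_sample(panel, 20)
```

When sub-populations explain little of the genetic variation, as reported here (16.6%), the two functions tend to select training sets of similar composition, which is consistent with the negligible difference the abstract describes.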
Pages: 16