Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling

Cited by: 15
Authors
Ros-Freixedes, Roger [1, 2, 3]
Whalen, Andrew [1, 2]
Gorjanc, Gregor [1, 2]
Mileham, Alan J. [4]
Hickey, John M. [1, 2]
Affiliations
[1] Univ Edinburgh, Roslin Inst, Easter Bush, Midlothian, Scotland
[2] Univ Edinburgh, Royal Dick Sch Vet Studies, Easter Bush, Midlothian, Scotland
[3] Univ Lleida, Agrotecnio Ctr, Dept Ciencia Anim, Lleida, Spain
[4] Genus Plc, 1525 River Rd, De Forest, WI 53532 USA
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC); Innovate UK;
Keywords
GENOTYPE IMPUTATION; COMPLEX TRAITS; MISSING GENOTYPES; SELECTION; ANIMALS; PREDICTION; INFERENCE; MILLIONS; DESIGN; PHASE;
DOI
10.1186/s12711-020-00537-7
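Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture];
Discipline classification code
0905;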
Abstract
Background: For assembling large whole-genome sequence datasets for routine use in research and breeding, the sequencing strategy should be adapted to the methods that will be used later for variant discovery and imputation. In this study, we used simulation to explore the impact that the sequencing strategy and level of sequencing investment have on the overall accuracy of imputation using hybrid peeling, a pedigree-based imputation method that is well suited for large livestock populations.
Methods: We simulated marker array and whole-genome sequence data for 15 populations with simulated or real pedigrees that had different structures. In these populations, we evaluated the effect on imputation accuracy of seven methods for selecting which individuals to sequence, the generation of the pedigree to which the sequenced individuals belonged, the use of variable or uniform coverage, and the trade-off between the number of sequenced individuals and their sequencing coverage. For each population, we considered four levels of investment in sequencing that were proportional to the size of the population.
Results: Imputation accuracy depended greatly on pedigree depth. The distribution of the sequenced individuals across the generations of the pedigree underlay the performance of the different methods used to select individuals to sequence, and it was critical for achieving high imputation accuracy in both early and late generations. Imputation accuracy was highest with a uniform coverage of 2x across the sequenced individuals rather than variable coverage. An investment equivalent to the cost of sequencing 2% of the population at 2x provided high imputation accuracy. The gain in imputation accuracy from additional investment decreased with larger populations and higher levels of investment. However, to achieve the same imputation accuracy, a proportionally greater investment must be used in the smaller populations compared to the larger ones.
Conclusions: Suitable sequencing strategies for subsequent imputation with hybrid peeling involve sequencing approximately 2% of the population at a uniform coverage of 2x, distributed preferably across all generations of the pedigree, except for the few earliest generations that lack genotyped ancestors. Such sequencing strategies are beneficial for generating whole-genome sequence data in populations with deep pedigrees of closely related individuals.
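The trade-off between the number of sequenced individuals and their coverage, under a fixed investment, can be illustrated with a short arithmetic sketch. It assumes, purely for illustration, that cost is proportional to individuals times coverage (so library preparation and other fixed per-sample costs are ignored); this cost model and the function names `investment_in_x` and `individuals_at_coverage` are hypothetical and are not taken from the paper.

```python
def investment_in_x(population_size, fraction=0.02, coverage=2.0):
    """Investment expressed in total 'x' of coverage.

    Defaults mirror the abstract's reference point: the cost of
    sequencing 2% of the population at 2x.
    ASSUMPTION: cost scales linearly with individuals * coverage.
    """
    return population_size * fraction * coverage


def individuals_at_coverage(budget_x, coverage):
    """How many individuals the same budget covers at a uniform coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return int(budget_x // coverage)


# Illustrative population of 10,000 individuals (a made-up size).
budget = investment_in_x(10_000)
for cov in (1.0, 2.0, 4.0):
    print(f"{cov:g}x -> {individuals_at_coverage(budget, cov)} individuals")
```

Under this simplified model, halving the coverage doubles the number of individuals that can be sequenced for the same investment. The abstract reports that a uniform 2x gave the highest imputation accuracy in the simulations.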
Pages: 19