The effect of rare alleles on estimated genomic relationships from whole genome sequence data

Cited by: 31
Authors
Eynard, Sonia E. [1 ,2 ,3 ,4 ]
Windig, Jack J. [1 ,4 ]
Leroy, Gregoire [2 ,3 ]
van Binsbergen, Rianne [1 ,5 ]
Calus, Mario P. L. [1 ]
Affiliations
[1] Wageningen UR Livestock Res, Anim Breeding & Genom Ctr, NL-6700 AH Wageningen, Netherlands
[2] AgroParisTech, UMR Genet Anim & Biol Integrat 1313, F-75231 Paris 05, France
[3] INRA, UMR Genet Anim & Biol Integrat 1313, F-78350 Jouy En Josas, France
[4] Wageningen UR, Ctr Genet Resources Netherlands, NL-6700 AA Wageningen, Netherlands
[5] Wageningen UR, Biometris, NL-6700 AA Wageningen, Netherlands
Source
BMC GENETICS | 2015 / Volume 16
Keywords
Whole genome sequence; Additive genetic relationship; Rare variants; Minor allele frequency; Inbreeding; PEDIGREE; CONSERVATION; INFORMATION; POPULATION; ACCURACY; COEFFICIENTS; IMPROVEMENT; CHALLENGES; PREDICTION; IMPUTATION;
DOI
10.1186/s12863-015-0185-0
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Relationships between individuals and inbreeding coefficients are commonly used for breeding decisions, but may be affected by the type of data used for their estimation. The proportion of variants with low Minor Allele Frequency (MAF) is larger in whole genome sequence (WGS) data compared to Single Nucleotide Polymorphism (SNP) chips. Therefore, WGS data provide true relationships between individuals and may influence breeding decisions and prioritisation for conservation of genetic diversity in livestock. This study identifies differences between relationships and inbreeding coefficients estimated using pedigree, SNP or WGS data for 118 Holstein bulls from the 1000 Bull genomes project. To determine the impact of rare alleles on the estimates we compared three scenarios of MAF restrictions: variants with a MAF higher than 5%, variants with a MAF higher than 1% and variants with a MAF between 1% and 5%.
Results: We observed significant differences between estimated relationships and, although less significantly, inbreeding coefficients from pedigree, SNP or WGS data, and between MAF restriction scenarios. Computed correlations between pedigree and genomic relationships, within groups with similar relationships, ranged from negative to moderate for both estimated relationships and inbreeding coefficients, but were high between estimates from SNP and WGS (0.49 to 0.99). Estimated relationships from genomic information exhibited higher variation than from pedigree. Inbreeding coefficients analysis showed that more complete pedigree records lead to higher correlation between inbreeding coefficients from pedigree and genomic data. Finally, estimates and correlations between additive genetic (A) and genomic (G) relationship matrices were lower, and variances of the relationships were larger when accounting for allele frequencies than without accounting for allele frequencies.
Conclusions: Relationship and inbreeding coefficient estimates differed significantly depending on whether pedigree data or genomic information was used, and on whether variants with a MAF below 5% were included or excluded. Estimated relationships and inbreeding coefficients are the basis for selection decisions. Therefore, it can be expected that using WGS instead of SNP data can affect selection decisions. Inclusion of rare variants will give access to the variation they carry, which is of interest for conservation of genetic diversity.
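The abstract does not give the formulas used, so the following is only an illustrative sketch of how a MAF restriction and a genomic relationship matrix (G) with or without allele frequencies might be computed. It assumes genotypes coded as 0/1/2 allele counts and a VanRaden-style G matrix, and it treats "without accounting for allele frequencies" as fixing every frequency at 0.5; both are assumptions, not the authors' confirmed method. The function names and the toy data are hypothetical, and the toy MAF threshold of 0.2 stands in for the 1% and 5% limits, which are not meaningful with four individuals.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele per variant (genotypes coded 0/1/2)."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]


def select_by_maf(genotypes, low, high=0.5):
    """Keep only variants whose minor allele frequency satisfies low < MAF <= high."""
    freqs = allele_frequencies(genotypes)
    keep = [j for j, p in enumerate(freqs) if low < min(p, 1 - p) <= high]
    return [[ind[j] for j in keep] for ind in genotypes]


def genomic_relationships(genotypes, use_frequencies=True):
    """VanRaden-style G = ZZ' / (2 * sum p(1-p)), with Z = genotypes - 2p.

    With use_frequencies=False every p is fixed at 0.5 (an assumption about
    what "not accounting for allele frequencies" means).
    """
    m = len(genotypes[0])
    freqs = allele_frequencies(genotypes) if use_frequencies else [0.5] * m
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all variants are monomorphic; G is undefined")
    centred = [[g - 2 * p for g, p in zip(ind, freqs)] for ind in genotypes]
    return [[sum(a * b for a, b in zip(zi, zj)) / scale for zj in centred]
            for zi in centred]


# Hypothetical toy data: 4 individuals (rows) x 5 variants (columns).
toy = [
    [0, 1, 2, 0, 1],
    [0, 1, 1, 0, 2],
    [1, 0, 2, 0, 0],
    [0, 2, 1, 0, 1],
]

common = select_by_maf(toy, 0.2)       # toy analogue of "MAF higher than x"
rare = select_by_maf(toy, 0.0, 0.2)    # toy analogue of "MAF between x and y"

G_freq = genomic_relationships(toy, use_frequencies=True)
G_half = genomic_relationships(toy, use_frequencies=False)
```

Comparing the off-diagonal elements of `G_freq` and `G_half`, or of G matrices built from `common` and `rare`, mirrors the kind of scenario comparison the abstract describes, though on real data the comparison would be made against pedigree relationships as well.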
Pages: 12