Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids

Cited by: 27
Authors
Blischak, Paul D. [1]
Kubatko, Laura S. [1, 2]
Wolfe, Andrea D. [1]
Affiliations
[1] Ohio State Univ, Dept Evolut Ecol & Organismal Biol, 318 W 12th Ave, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Stat, 1958 Neil Ave, Columbus, OH 43210 USA
Funding
National Science Foundation (USA);
Keywords
allelic dosage uncertainty; genotyping by sequencing; hierarchical Bayesian modelling; polyploidy; population genomics; RADseq; SNP DISCOVERY; POLYPLOIDY; GENETICS; INTROGRESSION; SEGREGATION; INHERITANCE; ADAPTATION; COALESCENT;
DOI
10.1111/1755-0998.12493
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty, ADU), which complicates the calculation of important quantities such as allele frequencies. Here, we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme-based techniques [e.g. GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity.
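The idea of summing over genotype uncertainty can be illustrated with a minimal Python sketch. It is not the polyfreqs implementation and not the authors' code. It assumes that reads carrying the counted allele are binomially distributed given the genotype, that genotypes are binomially distributed given the population allele frequency, and that there is a fixed sequencing error rate. It also swaps the paper's hierarchical Bayesian inference for a simple grid search over the marginal likelihood. All function names, the error rate and the toy read counts are hypothetical.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def marginal_read_likelihood(ref_reads, total_reads, ploidy, allele_freq, error=0.01):
    """P(ref_reads | total_reads, allele_freq), summed over the unknown genotype.

    The genotype g is the number of copies (0..ploidy) of the counted allele.
    Assumptions (hypothetical, for illustration): g ~ Binomial(ploidy, allele_freq)
    and ref_reads ~ Binomial(total_reads, p_ref), where p_ref mixes the true
    allele dosage g / ploidy with a symmetric sequencing error rate.
    """
    total = 0.0
    for g in range(ploidy + 1):
        genotype_prior = binom_pmf(g, ploidy, allele_freq)
        dosage = g / ploidy
        p_ref = dosage * (1.0 - error) + (1.0 - dosage) * error
        total += genotype_prior * binom_pmf(ref_reads, total_reads, p_ref)
    return total


def grid_estimate(reads, ploidy, error=0.01, grid_size=999):
    """Grid-search the allele frequency that maximises the marginal log-likelihood.

    `reads` is a list of (ref_reads, total_reads) pairs, one per individual.
    This is a maximum-likelihood stand-in, not the Bayesian sampler of the paper.
    """
    best_p, best_ll = None, float("-inf")
    for i in range(1, grid_size + 1):
        p = i / (grid_size + 1)  # interior grid points only, so log() is defined
        ll = sum(
            log(marginal_read_likelihood(r, t, ploidy, p, error)) for r, t in reads
        )
        if ll > best_ll:
            best_p, best_ll = p, ll
    return best_p


if __name__ == "__main__":
    # Toy tetraploid sample: (reads of counted allele, total reads) per individual.
    toy_reads = [(3, 12), (0, 8), (6, 10), (5, 20), (9, 9)]
    print(grid_estimate(toy_reads, ploidy=4))
```

Because the genotype is summed out rather than called, individuals with few reads still contribute to the estimate, only with less weight than well-covered individuals.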
Pages: 742-754
Number of pages: 13
Related papers
50 in total
  • [21] HLA-DQ alpha genotype and allele frequencies in an Austrian population
    Ambach, EA
    Zehethofer, K
    Scheithauer, R
    HUMAN HEREDITY, 1996, 46 (02) : 71 - 75
  • [22] Reconstructing SNP allele and genotype frequencies from GWAS summary statistics
    Zhiyu Yang
    Peristera Paschou
    Petros Drineas
    Scientific Reports, 12
  • [23] Reconstructing SNP allele and genotype frequencies from GWAS summary statistics
    Yang, Zhiyu
    Paschou, Peristera
    Drineas, Petros
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [24] Allele, Genotype and Haplotype Frequencies of the HLA System in the Province of Neuquen, Argentina
    Navello, Mariano
    Mueller, Maria Constanza
    Riboldi, Maria Victoria
    Rastellini, Carolina Victoria
    TRANSPLANTATION, 2022, 106 (09) : S566 - S566
  • [25] Estimation of the covariance structure from SNP allele frequencies
    van Waaij, Jan
    Li, Zilong
    Wiuf, Carsten
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2022, 21 (01)
  • [26] Power for genetic association studies with random allele frequencies and genotype distributions
    Ambrosius, WT
    Lange, EM
    Langefeld, CD
    AMERICAN JOURNAL OF HUMAN GENETICS, 2004, 74 (04) : 683 - 693
  • [27] Human Neutrophil Antigen Genotype and Allele Frequencies in Iranian Blood Donors
    Esmaeili, Behnaz
    Bayat, Behnaz
    Alirezaee, Atefe
    Delkhah, Mona
    Mehdizadeh, Mohammad Reza
    Pourpak, Zahra
    JOURNAL OF IMMUNOLOGY RESEARCH, 2022, 2022
  • [28] A CHANGE IN ALLELE AND GENOTYPE FREQUENCIES OF HALOTHANE LOCUS IN BOARS OF THE LANDRACE BREED
    KUCIEL, J
    DVORAK, J
    ZIVOCISNA VYROBA, 1992, 37 (08): : 633 - 638
  • [29] Estimation of German KIR Allele Group Haplotype Frequencies
    Solloch, Ute V.
    Schefzyk, Daniel
    Schaefer, Gesine
    Massalski, Carolin
    Kohler, Maja
    Pruschke, Jens
    Heidl, Annett
    Schetelig, Johannes
    Schmidt, Alexander H.
    Lange, Vinzenz
    Sauter, Juergen
    FRONTIERS IN IMMUNOLOGY, 2020, 11