SNP genotyping and parameter estimation in polyploids using low-coverage sequencing data

Cited by: 52
Authors
Blischak, Paul D. [1]
Kubatko, Laura S. [1, 2]
Wolfe, Andrea D. [1]
Affiliations
[1] Ohio State Univ, Dept Evolut Ecol & Organismal Biol, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Stat, Columbus, OH 43210 USA
Funding
National Science Foundation (USA)
Keywords
POPULATION GENETIC-STRUCTURE; MAXIMUM-LIKELIHOOD; GENOME; DISCOVERY; FRAMEWORK; LOCI; DIFFERENTIATION; FREQUENCY; DOMINANT;
DOI
10.1093/bioinformatics/btx587
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Genotyping and parameter estimation using high-throughput sequencing data are everyday tasks for population geneticists, but methods developed for diploids are typically not applicable to polyploid taxa. This is due to their duplicated chromosomes, as well as the complex patterns of allelic exchange that often accompany whole genome duplication (WGD) events. For WGDs within a single lineage (autopolyploids), inbreeding can result from mixed mating and/or double reduction. For WGDs that involve hybridization (allopolyploids), alleles are typically inherited through independently segregating subgenomes.
Results: We present two new models for estimating genotypes and population genetic parameters from genotype likelihoods for auto- and allopolyploids. We then use simulations to compare these models to existing approaches at varying depths of sequencing coverage and ploidy levels. These simulations show that our models typically have lower levels of estimation error for genotype and parameter estimates, especially when sequencing coverage is low. Finally, we also apply these models to two empirical datasets from the literature. Overall, we show that the use of genotype likelihoods to model non-standard inheritance patterns is a promising approach for conducting population genomic inferences in polyploids.
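To make the term "genotype likelihood" concrete, the following is a minimal Python sketch of a generic binomial read-count model for a biallelic SNP in a polyploid. It is an illustration only and not the authors' models, which the abstract says additionally handle inbreeding in autopolyploids and separate subgenomes in allopolyploids. The function names, the symmetric per-read error rate of 0.01, and the Hardy-Weinberg-style binomial prior are all assumptions introduced here.

```python
from math import comb


def genotype_likelihoods(ref_reads, total_reads, ploidy, error=0.01):
    """Return P(read counts | g) for g = 0..ploidy reference-allele copies.

    Assumes reads are independent draws and a symmetric per-read error
    rate (hypothetical default of 0.01).
    """
    likelihoods = []
    for g in range(ploidy + 1):
        frac = g / ploidy
        # Chance that a single read shows the reference allele, given g copies.
        p_ref = frac * (1 - error) + (1 - frac) * error
        likelihoods.append(
            comb(total_reads, ref_reads)
            * p_ref ** ref_reads
            * (1 - p_ref) ** (total_reads - ref_reads)
        )
    return likelihoods


def genotype_posteriors(likelihoods, allele_freq):
    """Combine likelihoods with a binomial prior on genotype counts.

    The prior assumes random mating with no inbreeding or double
    reduction, which is a simplification made for this sketch.
    """
    ploidy = len(likelihoods) - 1
    priors = [
        comb(ploidy, g) * allele_freq ** g * (1 - allele_freq) ** (ploidy - g)
        for g in range(ploidy + 1)
    ]
    weighted = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Tetraploid individual with 3 reference reads out of 6 at one site.
liks = genotype_likelihoods(ref_reads=3, total_reads=6, ploidy=4)
post = genotype_posteriors(liks, allele_freq=0.5)
best_genotype = max(range(len(post)), key=post.__getitem__)
```

At low coverage the likelihoods for neighbouring genotypes are often close, which is why carrying the full likelihood vector forward, rather than a single hard call, can reduce estimation error.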
Pages: 407-415
Number of pages: 9
Related papers
50 in total
  • [21] Genomic prediction using low-coverage portable Nanopore sequencing
    Lamb, Harrison J.
    Hayes, Ben J.
    Randhawa, Imtiaz A. S.
    Nguyen, Loan T.
    Ross, Elizabeth M.
    PLOS ONE, 2021, 16 (12):
  • [22] ACE: absolute copy number estimation from low-coverage whole-genome sequencing data
    Poell, Jos B.
    Mendeville, Matias
    Sie, Daoud
    Brink, Arjen
    Brakenhoff, Ruud H.
    Ylstra, Bauke
    BIOINFORMATICS, 2019, 35 (16) : 2847 - 2849
  • [23] PMAT: an efficient plant mitogenome assembly toolkit using low-coverage HiFi sequencing data
    Bi, Changwei
    Shen, Fei
    Han, Fuchuan
    Qu, Yanshu
    Hou, Jing
    Xu, Kewang
    Xu, Li-an
    He, Wenchuang
    Wu, Zhiqiang
    Yin, Tongming
    HORTICULTURE RESEARCH, 2024, 11 (03)
  • [24] Fast and Accurate 1000 Genomes Imputation Using Summary Statistics or Low-coverage Sequencing Data
    Pasaniuc, Bogdan
    Zaitlen, Noah
    Bhatia, Gaurav
    Gusev, Alexander
    Patterson, Nick
    Price, Alkes L.
    GENETIC EPIDEMIOLOGY, 2012, 36 (07) : 765 - 765
  • [25] Estimating microhaplotype allele frequencies from low-coverage or pooled sequencing data
    Thomas A. Delomas
    Stuart C. Willis
    BMC Bioinformatics, 24
  • [26] Estimating microhaplotype allele frequencies from low-coverage or pooled sequencing data
    Delomas, Thomas A.
    Willis, Stuart C.
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [27] Characterizing Bias in Population Genetic Inferences from Low-Coverage Sequencing Data
    Han, Eunjung
    Sinsheimer, Janet S.
    Novembre, John
    MOLECULAR BIOLOGY AND EVOLUTION, 2014, 31 (03) : 723 - 735
  • [28] Publisher Correction: Efficient phasing and imputation of low-coverage sequencing data using large reference panels
    Simone Rubinacci
    Diogo M. Ribeiro
    Robin J. Hofmeister
    Olivier Delaneau
    Nature Genetics, 2021, 53 : 412 - 412
  • [29] Estimation of site frequency spectra from low-coverage sequencing data using stochastic EM reduces overfitting, runtime, and memory usage
    Rasmussen, Malthe Sebro
    Garcia-Erill, Genis
    Korneliussen, Thorfinn Sand
    Wiuf, Carsten
    Albrechtsen, Anders
    GENETICS, 2022, 222 (04)
  • [30] Detecting selection in low-coverage high-throughput sequencing data using principal component analysis
    Meisner, Jonas
    Albrechtsen, Anders
    Hanghoj, Kristian
    BMC BIOINFORMATICS, 2021, 22 (01)