distAngsd: Fast and Accurate Inference of Genetic Distances for Next-Generation Sequencing Data

Cited by: 1
Authors
Zhao, Lei [1]
Nielsen, Rasmus [1, 2, 3]
Korneliussen, Thorfinn Sand [1]
Affiliations
[1] Univ Copenhagen, Globe Inst, Sect Geogenet, Oster Voldgade 5-7, DK-1350 Copenhagen K, Denmark
[2] Univ Calif Berkeley, Dept Integrat Biol, 3040 Valley Life Sci Bldg 3140, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Dept Stat, 3040 Valley Life Sci Bldg 3140, Berkeley, CA 94720 USA
Keywords
phylogeny reconstruction; genotype likelihood; genetic distance; high-throughput sequencing; next-generation sequencing; molecular evolution; maximum likelihood; expectation maximization; HAPLOTYPE RECONSTRUCTION; MAXIMUM-LIKELIHOOD; DNA; MITOCHONDRIAL; SITES; SUBSTITUTIONS; ASSOCIATION; FRAMEWORK; GENOTYPE; GENOMES;
DOI
10.1093/molbev/msac119
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Commonly used methods for inferring phylogenies were designed before the emergence of high-throughput sequencing and generally cannot accommodate the challenges associated with noisy, diploid sequencing data. In many applications, diploid genomes are still treated as haploid through the use of ambiguity characters, while the uncertainty in genotype calling, which arises as a consequence of the sequencing technology, is ignored. To address this problem, we describe two new probabilistic approaches for estimating genetic distances: distAngsd-geno and distAngsd-nuc, both implemented in a software suite named distAngsd. These methods are specifically designed for next-generation sequencing data, utilize the full information from the data, and take uncertainty in genotype calling into account. Through extensive simulations, we show that these new methods are markedly more accurate and have more stable statistical behaviors than other currently available methods for estimating genetic distances, even for very low depth data with high error rates.
Pages: 1084-1097
Page count: 14