Improving the estimation of genetic distances from Next-Generation Sequencing data

Cited by: 77
Authors
Vieira, Filipe G. [1, 2]
Lassalle, Florent [3]
Korneliussen, Thorfinn S. [1, 2]
Fumagalli, Matteo [3]
Affiliations
[1] Univ Copenhagen, Ctr GeoGenet, DK-2100 Copenhagen, Denmark
[2] Univ Copenhagen, Nat Hist Museum Denmark, Evogenom Sect, DK-2100 Copenhagen, Denmark
[3] UCL, UCL Genet Inst, Dept Genet Evolut & Environm, London WC1E 6BT, England
Keywords
Bayesian inference; maximum likelihood; phylogenetics; population structure; PHYLOGENY RECONSTRUCTION; POPULATION GENOMICS; ALLELE FREQUENCY; RECOMBINATION; ASSOCIATION; POLYMORPHISM; ADAPTATION; EVOLUTION; INFERENCE; MAP;
DOI
10.1111/bij.12511
CLC number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Next-Generation Sequencing (NGS) technologies have revolutionized research in evolutionary biology by increasing sequencing speed and reducing experimental costs. However, sequencing error rates are higher than in traditional technologies and, furthermore, many studies rely on low-depth sequencing. Under these circumstances, standard methods for inferring genotypes yield biased estimates of nucleotide variation, which can bias all downstream analyses. Through simulations, we assessed the bias in estimating genetic distances under several different scenarios. The results indicate that naive methods for assigning individual genotypes greatly overestimate genetic distances. We propose a novel method to estimate genetic distances that is suitable for low-depth NGS data and takes the statistical uncertainty of genotype calls into account. We applied this method to investigate the genetic structure of domesticated and wild strains of rice. We implemented this approach in open-source software and discuss further directions for phylogenetic analyses within this novel probabilistic framework. © 2015 The Linnean Society of London
Pages: 139-149
Page count: 11