STELAR: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency

Cited by: 14
Authors
Islam, Mazharul [1]
Sarker, Kowshika [1]
Das, Trisha [1]
Reaz, Rezwana [2]
Bayzid, Md. Shamsuzzoha [1]
Affiliations
[1] Bangladesh Univ Engn & Technol, Dept Comp Sci & Engn, Dhaka 1205, Bangladesh
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Keywords
Phylogenomics; Multi-species coalescent process; Gene tree incongruence; Incomplete lineage sorting; GENE TREES; GENOME; PHYLOGENOMICS; LIKELIHOOD; INFERENCE; DIVERSIFICATION; RECONSTRUCTION; ORIGIN;
DOI
10.1186/s12864-020-6519-y
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, estimating a species tree from a collection of gene trees can be complicated by gene tree incongruence resulting from incomplete lineage sorting (ILS), which is modelled by the multi-species coalescent process. Maximum likelihood and Bayesian MCMC methods can potentially produce accurate trees, but they do not scale well to large datasets.
Results: We present STELAR (Species Tree Estimation by maximizing tripLet AgReement), a new fast and highly accurate, statistically consistent, coalescent-based method for estimating species trees from a collection of gene trees. We formalized the constrained triplet consensus (CTC) problem and showed that the solution to the CTC problem is a statistically consistent estimate of the species tree under the multi-species coalescent (MSC) model. STELAR is an efficient dynamic-programming-based solution to the CTC problem that is highly accurate and scalable. We evaluated the accuracy of STELAR in comparison with SuperTriplets, an alternative fast and highly accurate triplet-based supertree method, and with MP-EST and ASTRAL, two of the most popular and accurate coalescent-based methods. Experimental results suggest that STELAR matches the accuracy of ASTRAL and improves on MP-EST and SuperTriplets.
Conclusions: Theoretical and empirical results (on both simulated and real biological datasets) suggest that STELAR is a valuable technique for species tree estimation from gene tree distributions.
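The quantity that the abstract says STELAR maximizes, triplet agreement between a species tree and a set of gene trees, can be illustrated with a small brute-force sketch. This is not STELAR's dynamic programming algorithm and is not taken from the paper's code: the nested-tuple tree encoding and the function names `clusters`, `triplets`, and `triplet_score` are assumptions made for illustration only. The sketch enumerates every three-taxon subset, which is feasible only for small trees, and counts how many rooted triplets of each gene tree are also displayed by a candidate species tree.

```python
from itertools import combinations


def clusters(tree):
    """Return (leaf set of `tree`, leaf sets of all its internal nodes).

    Assumed encoding: a leaf is a string, an internal node is a tuple of children.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)
    return leaves, found


def triplets(tree):
    """Resolved rooted triplets ab|c of `tree`, encoded as (frozenset({a, b}), c)."""
    leaves, found = clusters(tree)
    result = set()
    for a, b, c in combinations(sorted(leaves), 3):
        for pair, out in (((a, b), c), ((a, c), b), ((b, c), a)):
            # ab|c holds when some internal node contains a and b but not c.
            if any(pair[0] in cl and pair[1] in cl and out not in cl for cl in found):
                result.add((frozenset(pair), out))
    return result


def triplet_score(species_tree, gene_trees):
    """Total number of gene tree triplets that the species tree also displays."""
    species_triplets = triplets(species_tree)
    return sum(len(species_triplets & triplets(g)) for g in gene_trees)


# Hypothetical toy input: three gene trees on taxa A-D, two candidate species trees.
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
candidate_1 = ((("A", "B"), "C"), "D")
candidate_2 = (("A", "C"), ("B", "D"))

print(triplet_score(candidate_1, gene_trees))
print(triplet_score(candidate_2, gene_trees))
```

On this toy input the first candidate, which matches the majority gene tree topology, scores 11 and the second scores 4, so a triplet-agreement criterion would prefer the first. Exhaustive scoring of this kind only compares candidates that are supplied by hand; an efficient search over candidate trees is what the dynamic programming formulation described in the abstract provides.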
Pages: 13
Related papers
32 in total
  • [1] STELAR: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency
    Mazharul Islam
    Kowshika Sarker
    Trisha Das
    Rezwana Reaz
    Md. Shamsuzzoha Bayzid
    [J]. BMC Genomics, 21
  • [2] Consistency of SVDQuartets and Maximum Likelihood for Coalescent-Based Species Tree Estimation
    Wascher, Matthew
    Kubatko, Laura
    [J]. SYSTEMATIC BIOLOGY, 2021, 70 (01) : 33 - 48
  • [3] ASTRAL: genome-scale coalescent-based species tree estimation
    Mirarab, S.
    Reaz, R.
    Bayzid, Md. S.
    Zimmermann, T.
    Swenson, M. S.
    Warnow, T.
    [J]. BIOINFORMATICS, 2014, 30 (17) : I541 - I548
  • [4] Statistical Consistency of Coalescent-Based Species Tree Methods Under Models of Missing Data
    Nute, Michael
    Chou, Jed
    [J]. COMPARATIVE GENOMICS, RECOMB CG 2017, 2017, 10562 : 277 - 297
  • [5] A comparative study of SVDquartets and other coalescent-based species tree estimation methods
    Chou, Jed
    Gupta, Ashu
    Yaduvanshi, Shashank
    Davidson, Ruth
    Nute, Mike
    Mirarab, Siavash
    Warnow, Tandy
    [J]. BMC GENOMICS, 2015, 16 : 1 - 11
  • [6] Assessing the Impacts of Positive Selection on Coalescent-Based Species Tree Estimation and Species Delimitation
    Adams, Richard H.
    Schield, Drew R.
    Card, Daren C.
    Castoe, Todd A.
    [J]. SYSTEMATIC BIOLOGY, 2018, 67 (06) : 1076 - 1090
  • [7] A comparative study of SVDquartets and other coalescent-based species tree estimation methods
    Jed Chou
    Ashu Gupta
    Shashank Yaduvanshi
    Ruth Davidson
    Mike Nute
    Siavash Mirarab
    Tandy Warnow
    [J]. BMC Genomics, 16
  • [8] On the Robustness to Gene Tree Estimation Error (or lack thereof) of Coalescent-Based Species Tree Methods
    Roch, Sebastien
    Warnow, Tandy
    [J]. SYSTEMATIC BIOLOGY, 2015, 64 (04) : 663 - 676
  • [9] The performance of coalescent-based species tree estimation methods under models of missing data
    Nute, Michael
    Chou, Jed
    Molloy, Erin K.
    Warnow, Tandy
    [J]. BMC GENOMICS, 2018, 19
  • [10] Impact of Ghost Introgression on Coalescent-Based Species Tree Inference and Estimation of Divergence Time
    Pang, Xiao-Xu
    Zhang, Da-Yong
    [J]. SYSTEMATIC BIOLOGY, 2023, 72 (01) : 35 - 49