STELAR: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency

Cited by: 14
Authors
Islam, Mazharul [1]
Sarker, Kowshika [1]
Das, Trisha [1]
Reaz, Rezwana [2]
Bayzid, Md. Shamsuzzoha [1]
Affiliations
[1] Bangladesh Univ Engn & Technol, Dept Comp Sci & Engn, Dhaka 1205, Bangladesh
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Keywords
Phylogenomics; Multi-species coalescent process; Gene tree incongruence; Incomplete lineage sorting; GENE TREES; GENOME; PHYLOGENOMICS; LIKELIHOOD; INFERENCE; DIVERSIFICATION; RECONSTRUCTION; ORIGIN;
DOI
10.1186/s12864-020-6519-y
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, estimating a species tree from a collection of gene trees can be complicated due to the presence of gene tree incongruence resulting from incomplete lineage sorting (ILS), which is modelled by the multi-species coalescent process. Maximum likelihood and Bayesian MCMC methods can potentially result in accurate trees, but they do not scale well to large datasets.
Results: We present STELAR (Species Tree Estimation by maximizing tripLet AgReement), a new fast and highly accurate statistically consistent coalescent-based method for estimating species trees from a collection of gene trees. We formalized the constrained triplet consensus (CTC) problem and showed that the solution to the CTC problem is a statistically consistent estimate of the species tree under the multi-species coalescent (MSC) model. STELAR is an efficient dynamic programming based solution to the CTC problem which is highly accurate and scalable. We evaluated the accuracy of STELAR in comparison with SuperTriplets, which is an alternate fast and highly accurate triplet-based supertree method, and with MP-EST and ASTRAL, two of the most popular and accurate coalescent-based methods. Experimental results suggest that STELAR matches the accuracy of ASTRAL and improves on MP-EST and SuperTriplets.
Conclusions: Theoretical and empirical results (on both simulated and real biological datasets) suggest that STELAR is a valuable technique for species tree estimation from gene tree distributions.
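To make the triplet-agreement criterion in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not STELAR's dynamic programming algorithm, which this record does not describe in detail. It only scores one candidate species tree by counting how many rooted triplets in the gene trees it agrees with. The nested-tuple tree encoding and the names `leaves`, `rooted_triplets`, and `triplet_score` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Assumed encoding: a leaf is a string, an internal node is a tuple of children.


def leaves(tree):
    """Return the list of leaf labels below a node."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def rooted_triplets(tree):
    """Yield (frozenset({a, b}), c) for every rooted triplet ab|c induced by tree.

    A triplet ab|c is induced at the node where a and b fall in one child
    clade and c falls in a different child clade.
    """
    if isinstance(tree, str):
        return
    child_leaves = [leaves(child) for child in tree]
    for i, inside in enumerate(child_leaves):
        for j, outside in enumerate(child_leaves):
            if i == j:
                continue
            for a, b in combinations(inside, 2):
                for c in outside:
                    yield (frozenset((a, b)), c)
    for child in tree:
        yield from rooted_triplets(child)


def triplet_score(candidate, gene_trees):
    """Count gene-tree triplets (with multiplicity) that the candidate also induces."""
    counts = Counter()
    for gene_tree in gene_trees:
        counts.update(rooted_triplets(gene_tree))
    return sum(counts[t] for t in rooted_triplets(candidate))


# Toy input: two gene trees support ((A,B),C) and one supports ((A,C),B).
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
majority = ((("A", "B"), "C"), "D")
minority = ((("A", "C"), "B"), "D")

print(triplet_score(majority, gene_trees))  # 11
print(triplet_score(minority, gene_trees))  # 10
```

On this toy input the candidate matching the majority gene-tree topology scores higher, which is the quantity a triplet-consensus method seeks to maximize. Enumerating triplets this way takes cubic time in the number of taxa per tree and is meant only to illustrate the objective.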
Pages: 13