TreeMerge: a new method for improving the scalability of species tree estimation methods

Cited by: 11
Authors
Molloy, Erin K. [1]
Warnow, Tandy [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
Keywords
PHYLOGENY INFERENCE; GENE; ACCURATE; DUPLICATIONS; LIKELIHOOD
DOI
10.1093/bioinformatics/btz344
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: At RECOMB-CG 2018, we presented NJMerge and showed that it could be used within a divide-and-conquer framework to scale computationally intensive methods for species tree estimation to larger datasets. However, NJMerge has two significant limitations: it can fail to return a tree and, when used within the proposed divide-and-conquer framework, it has O(n^5) running time for datasets with n species.
Results: Here we present a new method called TreeMerge that improves on NJMerge in two ways: it is guaranteed to return a tree, and it has a dramatically faster running time within the same divide-and-conquer framework: only O(n^2) time. We use a simulation study to evaluate TreeMerge in the context of multi-locus species tree estimation with two leading methods, ASTRAL-III and RAxML. We find that the divide-and-conquer framework using TreeMerge has a minor impact on species tree accuracy, dramatically reduces running time, and enables both ASTRAL-III and RAxML to complete on datasets on which they would otherwise fail when given 64 GB of memory and a 48 h maximum running time. Thus, TreeMerge is a step toward a larger vision of enabling researchers with limited computational resources to perform large-scale species tree estimation, which we call Phylogenomics for All.
Availability and implementation: TreeMerge is publicly available on GitHub (http://github.com/ekmolloy/treemerge).
Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: i417-i426
Page count: 10