TreeMerge: a new method for improving the scalability of species tree estimation methods

Cited by: 11
Authors
Molloy, Erin K. [1 ]
Warnow, Tandy [1 ]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
National Science Foundation (USA);
Keywords
PHYLOGENY INFERENCE; GENE; ACCURATE; DUPLICATIONS; LIKELIHOOD;
DOI
10.1093/bioinformatics/btz344
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: At RECOMB-CG 2018, we presented NJMerge and showed that it could be used within a divide-and-conquer framework to scale computationally intensive methods for species tree estimation to larger datasets. However, NJMerge has two significant limitations: it can fail to return a tree and, when used within the proposed divide-and-conquer framework, has O(n^5) running time for datasets with n species.
Results: Here we present a new method called TreeMerge that improves on NJMerge in two ways: it is guaranteed to return a tree, and it has dramatically faster running time within the same divide-and-conquer framework, only O(n^2) time. We use a simulation study to evaluate TreeMerge in the context of multi-locus species tree estimation with two leading methods, ASTRAL-III and RAxML. We find that the divide-and-conquer framework using TreeMerge has a minor impact on species tree accuracy, dramatically reduces running time, and enables both ASTRAL-III and RAxML to complete on datasets on which they would otherwise fail when given 64 GB of memory and a 48 h maximum running time. Thus, TreeMerge is a step toward a larger vision of enabling researchers with limited computational resources to perform large-scale species tree estimation, which we call Phylogenomics for All.
Availability and implementation: TreeMerge is publicly available on GitHub (http://github.com/ekmolloy/treemerge).
Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: I417-I426
Page count: 10