Spectral cluster supertree: fast and statistically robust merging of rooted phylogenetic trees

Authors
McArthur, Robert N. [1 ]
Zehmakan, Ahad N. [2 ]
Charleston, Michael A. [3 ]
Lin, Yu [2 ]
Huttley, Gavin [1 ]
Affiliations
[1] Australian Natl Univ, Res Sch Biol, Canberra, ACT, Australia
[2] Australian Natl Univ, Sch Comp, Canberra, ACT, Australia
[3] Univ Tasmania, Sch Nat Sci, Hobart, Tas, Australia
Keywords
supertree; spectral clustering; rooted phylogenetic trees; phylogenetics; molecular evolution; INFERENCE; DISTANCE;
DOI
10.3389/fmolb.2024.1432495
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The algorithms for phylogenetic reconstruction are central to computational molecular evolution. The relentless pace of data acquisition has exposed their poor scalability and led to the conclusion that the conventional application of these methods is impractical and not justifiable from an energy usage perspective. Furthermore, the drive to improve the statistical performance of phylogenetic methods produces increasingly parameter-rich models of sequence evolution, which worsens the computational performance. Established theoretical and algorithmic results identify supertree methods as critical to divide-and-conquer strategies for improving scalability of phylogenetic reconstruction. Of particular importance is the ability to explicitly accommodate rooted topologies. These can arise from the more biologically plausible non-stationary models of sequence evolution. We make a contribution to addressing this challenge with Spectral Cluster Supertree, a novel supertree method for merging a set of overlapping rooted phylogenetic trees. It offers significant improvements over Min-Cut supertree and previous state-of-the-art methods in terms of both time complexity and overall topological accuracy, particularly for problems of large size. We perform comparisons against Min-Cut supertree and Bad Clade Deletion. Leveraging two tree topology distance metrics, we demonstrate that while Bad Clade Deletion generates more correct clades in its resulting supertree, Spectral Cluster Supertree's generated tree is generally more topologically close to the true model tree. Over large datasets containing 10,000 taxa and approximately 500 source trees, where Bad Clade Deletion usually takes approximately 2 h to run, our method generates a supertree in 20 s on average. Spectral Cluster Supertree is released under an open source license and is available on the Python Package Index as sc-supertree.
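The abstract compares methods partly by how many "correct clades" a supertree recovers. As a rough illustration of that idea only, the following sketch counts the clades that an estimated rooted tree shares with a model tree. The nested-tuple tree representation and the function names `clades` and `correct_clades` are assumptions made for this example; they are not the sc-supertree API, and this is not one of the two distance metrics used in the paper.

```python
def clades(tree):
    """Return the non-root clades of a rooted tree as frozensets of leaf names.

    Assumed representation: a tree is either a leaf name (str) or a tuple
    of subtrees. Single leaves are not counted as clades.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaves)  # record the leaf set under this internal node
        return leaves

    # The root clade (all leaves) is shared by any two trees on the same
    # taxa, so it is dropped as uninformative.
    found.discard(leaves_below(tree))
    return found


def correct_clades(model_tree, estimated_tree):
    """Count the non-root clades of the estimate that also occur in the model."""
    return len(clades(model_tree) & clades(estimated_tree))


# Hypothetical five-taxon example: the estimate misplaces taxon "b".
model = ((("a", "b"), "c"), ("d", "e"))
estimate = ((("a", "c"), "b"), ("d", "e"))
shared = correct_clades(model, estimate)
```

In this toy case the two trees share the clades {a, b, c} and {d, e} but disagree on the grouping inside {a, b, c}.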
Pages: 15