Disjoint Tree Mergers for Large-Scale Maximum Likelihood Tree Estimation

Cited by: 7
Authors
Park, Minhyuk [1]
Zaharias, Paul [1]
Warnow, Tandy [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
Keywords
phylogeny estimation; maximum likelihood; RAxML; IQ-TREE; FastTree; cox1; heterotachy; disjoint tree mergers; tree of life; SEQUENCE ALIGNMENTS; ACCURATE; NUCLEOTIDE;
DOI
10.3390/a14050148
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The estimation of phylogenetic trees for individual genes or multi-locus datasets is a basic part of considerable biological research. In order to enable large trees to be computed, Disjoint Tree Mergers (DTMs) have been developed; these methods operate by dividing the input sequence dataset into disjoint sets, constructing trees on each subset, and then combining the subset trees (using auxiliary information) into a tree on the full dataset. DTMs have been used to advantage for multi-locus species tree estimation, enabling highly accurate species trees at reduced computational effort, compared to leading species tree estimation methods. Here, we evaluate the feasibility of using DTMs to improve the scalability of maximum likelihood (ML) gene tree estimation to large numbers of input sequences. Our study shows distinct differences between the three selected ML codes (RAxML-NG, IQ-TREE 2, and FastTree 2) and shows that good DTM pipeline design can provide advantages over these ML codes on large datasets.
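The divide, estimate, and merge stages described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names (`split_disjoint`, `caterpillar_newick`, `naive_merge`, `dtm_pipeline`) are hypothetical, the "tree estimator" is a placeholder that stands in for an ML code, and the merger simply attaches the subset trees under one root without the auxiliary information that an actual DTM uses. It shows only the shape of the pipeline, not the authors' method.

```python
from typing import Callable, List, Tuple


def split_disjoint(names: List[str], max_size: int) -> List[List[str]]:
    """Divide the sequence names into disjoint consecutive subsets.

    Assumption: a real pipeline would choose subsets more carefully;
    consecutive chunks are used here only to keep the sketch short.
    """
    if max_size < 1:
        raise ValueError("max_size must be positive")
    return [names[i:i + max_size] for i in range(0, len(names), max_size)]


def caterpillar_newick(names: List[str]) -> str:
    """Placeholder tree estimator: a caterpillar tree in Newick form.

    Stands in for running an ML code on the subset; no likelihood
    computation happens here.
    """
    tree = names[0]
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree


def naive_merge(subtrees: List[str]) -> str:
    """Placeholder merger: attach all subset trees under a single root.

    A real DTM combines subset trees using auxiliary information;
    this star-shaped join ignores that step entirely.
    """
    if len(subtrees) == 1:
        return subtrees[0] + ";"
    return "(" + ",".join(subtrees) + ");"


def dtm_pipeline(
    names: List[str],
    max_size: int,
    estimate: Callable[[List[str]], str] = caterpillar_newick,
) -> Tuple[List[List[str]], str]:
    """Run the three stages: divide, estimate per subset, merge."""
    subsets = split_disjoint(names, max_size)
    subtrees = [estimate(subset) for subset in subsets]
    return subsets, naive_merge(subtrees)


if __name__ == "__main__":
    subsets, merged = dtm_pipeline(["a", "b", "c", "d", "e"], max_size=2)
    print(subsets)
    print(merged)
```

The `estimate` parameter is a plain callable so that the placeholder could be swapped for a wrapper around an external tree estimation tool; how such a wrapper would be invoked is outside the scope of this sketch.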
Pages: 25