Multiple Sequence Alignment Averaging Improves Phylogeny Reconstruction

Cited by: 22
Authors
Ashkenazy, Haim [1]
Sela, Itamar [2]
Karin, Eli Levy [1, 3]
Landan, Giddy [4]
Pupko, Tal [1]
Affiliations
[1] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Cell Res & Immunol, IL-69978 Tel Aviv, Israel
[2] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
[3] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Mol Biol & Ecol Plants, IL-69978 Tel Aviv, Israel
[4] Christian Albrechts Univ Kiel, Inst Microbiol, D-24118 Kiel, Germany
Funding
European Research Council
Keywords
Alignment reliability; multiple sequence alignment; phylogeny; tree reconstruction; JOINT BAYESIAN-ESTIMATION; PROTEIN-SEQUENCE; MODEL; UNCERTAINTY; INFERENCE; EVOLUTION; ACCURACY; MAFFT; PERFORMANCE; CHALLENGES;
DOI
10.1093/sysbio/syy036
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The classic methodology of inferring a phylogenetic tree from sequence data is composed of two steps. First, a multiple sequence alignment (MSA) is computed. Then, a tree is reconstructed assuming the MSA is correct. Yet, inferred MSAs were shown to be inaccurate, and alignment errors reduce tree inference accuracy. It was previously proposed that filtering unreliable alignment regions can increase the accuracy of tree inference. However, it was also demonstrated that the benefit of this filtering is often obscured by the resulting loss of phylogenetic signal. In this work, we explore an approach in which, instead of relying on a single MSA, we generate a large set of alternative MSAs and concatenate them into a single SuperMSA. By doing so, we account for phylogenetic signals contained in columns that are not present in the single MSA computed by alignment algorithms. Using simulations, we demonstrate that this approach results, on average, in more accurate trees compared to 1) using an unfiltered MSA and 2) using a single MSA with weights assigned to columns according to their reliability. Next, we explore in which regions of the MSA space our approach is expected to be beneficial. Finally, we provide a simple criterion for deciding whether or not the extra effort of computing a SuperMSA and inferring a tree from it is beneficial. Based on these assessments, we expect our methodology to be useful for many cases in which diverged sequences are analyzed. The option to generate such a SuperMSA is available at http://guidance.tau.ac.il.
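The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `concatenate_msas`, the dictionary representation of an alignment (taxon name mapped to gapped sequence), and the toy sequences are all assumptions made for illustration. How the alternative MSAs are generated is outside the scope of the sketch.

```python
from typing import Dict, List

# Assumed representation: taxon name -> aligned (gapped) sequence.
Alignment = Dict[str, str]


def concatenate_msas(msas: List[Alignment]) -> Alignment:
    """Join alternative alignments of the same taxa column-wise.

    Hypothetical helper: each input alignment contributes all of its
    columns, so the result is as long as the sum of the input lengths.
    """
    if not msas:
        raise ValueError("need at least one alignment")
    taxa = sorted(msas[0])
    for msa in msas:
        if sorted(msa) != taxa:
            raise ValueError("all alignments must contain the same taxa")
        if len({len(row) for row in msa.values()}) != 1:
            raise ValueError("rows within an alignment must have equal length")
    # Append the rows of each alternative alignment, taxon by taxon.
    return {taxon: "".join(msa[taxon] for msa in msas) for taxon in taxa}


# Two toy alternative alignments of the same three sequences.
msa_a = {"sp1": "AC-GT", "sp2": "ACGGT", "sp3": "A--GT"}
msa_b = {"sp1": "ACG-T", "sp2": "ACGGT", "sp3": "A-G-T"}
super_msa = concatenate_msas([msa_a, msa_b])
```

In this toy case every row of `super_msa` has 10 columns, 5 from each input alignment, and the result could then be passed to any tree reconstruction program in place of a single MSA.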
Pages: 117-130
Number of pages: 14