Multiple sequence alignment accuracy and phylogenetic inference

Cited by: 172
Authors
Ogden, TH [1]
Rosenberg, MS
Affiliations
[1] Arizona State Univ, Biodesign Inst, Ctr Evolut Funct Genom, Tempe, AZ 85287 USA
[2] Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA
Keywords
Bayesian; maximum likelihood; maximum parsimony; multiple sequence alignment; neighbor joining; phylogenetics; simulation; tree reconstruction
DOI
10.1080/10635150500541730
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Phylogenies are often thought to depend more on the specifics of the sequence alignment than on the method of reconstruction. Sequences containing insertion and deletion events were simulated to determine the role that alignment accuracy plays in phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy: that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian methods generally outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the lengths of a branch and of its neighboring branches increase, alignment accuracy decreases, and that the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple sequence alignment can be an important factor in downstream effects on topological reconstruction.
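The abstract does not say how alignment accuracy was computed, so the following is only an illustrative sketch of one common way to compare a hypothesized alignment with a true alignment: the fraction of residue pairs aligned in the true alignment that are also aligned in the hypothesized one (a sum-of-pairs style score). The function names `aligned_pairs` and `sum_of_pairs_accuracy`, the use of `-` as the gap character, and the toy sequences are assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence index, ungapped position).
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all alignment rows must have the same length")
    next_residue = [0] * len(alignment)  # ungapped position per sequence
    pairs = set()
    n_columns = len(alignment[0]) if alignment else 0
    for col in range(n_columns):
        present = []
        for seq_index, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_index, next_residue[seq_index]))
                next_residue[seq_index] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_accuracy(true_alignment, hypothesized_alignment):
    """Fraction of truly aligned residue pairs recovered by the hypothesis."""
    true_pairs = aligned_pairs(true_alignment)
    if not true_pairs:
        return 1.0  # nothing to recover
    recovered = true_pairs & aligned_pairs(hypothesized_alignment)
    return len(recovered) / len(true_pairs)


# Toy example: the hypothesized alignment misplaces one gap.
true_aln = ["AC-GT", "ACTGT"]
hyp_aln = ["ACG-T", "ACTGT"]
print(sum_of_pairs_accuracy(true_aln, hyp_aln))  # 0.75
```

In the toy example the true alignment contains four aligned residue pairs and the hypothesized alignment recovers three of them, hence 0.75. A per-branch measure like the one the abstract mentions could in principle restrict the comparison to the two sequences at either end of a branch, though that restriction is likewise an assumption here.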
Pages: 314-328
Page count: 15