Multiple sequence alignment accuracy and phylogenetic inference

Cited by: 172
Authors
Ogden, TH [1]
Rosenberg, MS
Affiliations
[1] Arizona State Univ, Biodesign Inst, Ctr Evolut Funct Genom, Tempe, AZ 85287 USA
[2] Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA
Keywords
Bayesian; maximum likelihood; maximum parsimony; multiple sequence alignment; neighbor joining; phylogenetics; simulation; tree reconstruction
DOI
10.1080/10635150500541730
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09
Abstract
Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.
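Illustrative note: the abstract compares hypothesized alignments with true alignments but does not say how accuracy was scored. The sketch below shows one common choice, a sum-of-pairs style score (the fraction of residue pairs aligned in the true alignment that are also aligned in the hypothesized one). The function names and the toy alignments are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def residue_pairs(alignment):
    """Return the set of aligned residue pairs in an alignment.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each pair is ((seq_i, residue_index_i), (seq_j, residue_index_j)),
    where residue indices count ungapped positions within each sequence.
    """
    counters = [0] * len(alignment)  # residues seen so far in each sequence
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_id, char in enumerate(column):
            if char != "-":
                present.append((seq_id, counters[seq_id]))
                counters[seq_id] += 1
        # every two residues sharing a column count as one aligned pair
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_accuracy(true_alignment, hypothesized_alignment):
    """Fraction of residue pairs in the true alignment that are recovered."""
    true_pairs = residue_pairs(true_alignment)
    if not true_pairs:
        return 1.0  # nothing to recover, so nothing was missed
    found = residue_pairs(hypothesized_alignment)
    return len(true_pairs & found) / len(true_pairs)


# Toy example (hypothetical data): same three sequences, two alignments.
true_aln = ["AC-GT", "ACTGT", "A--GT"]
hyp_aln = ["ACG-T", "ACTGT", "A-G-T"]
score = sum_of_pairs_accuracy(true_aln, hyp_aln)
print(f"sum-of-pairs accuracy: {score:.2f}")
```

For the toy data, 8 of the 10 true residue pairs are recovered, so the score is 0.80. Restricting the true pairs to sequences on either side of a single branch would give a per-branch variant, though whether that matches the paper's branch-level measure is not stated in the abstract.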
Pages: 314 - 328
Number of pages: 15
Related papers
50 in total
  • [21] MAFFT version 5: improvement in accuracy of multiple sequence alignment
    Katoh, K
    Kuma, K
    Toh, H
    Miyata, T
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 (02) : 511 - 518
  • [22] OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy
    Raghava, GPS
    Searle, SMJ
    Audley, PC
    Barber, JD
    Barton, GJ
    [J]. BMC BIOINFORMATICS, 2003, 4 (1)
  • [23] MUSCLE: multiple sequence alignment with high accuracy and high throughput
    Edgar, RC
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 (05) : 1792 - 1797
  • [24] Accuracy Estimation and Parameter Advising for Protein Multiple Sequence Alignment
    Kececioglu, John
    DeBlasio, Dan
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2013, 20 (04) : 259 - 279
  • [25] Improvement in speed and accuracy of multiple sequence alignment program prime
    Waseda University, Computational Biology Research Center, Japan
    Not available
    Not available
    [J]. IPSJ Trans. Bioinformatics, 2008, (2-12):
  • [26] TCS: a web server for multiple sequence alignment evaluation and phylogenetic reconstruction
    Chang, Jia-Ming
    Di Tommaso, Paolo
    Lefort, Vincent
    Gascuel, Olivier
    Notredame, Cedric
    [J]. NUCLEIC ACIDS RESEARCH, 2015, 43 (W1) : W3 - W6
  • [27] A Method of Alignment Masking for Refining the Phylogenetic Signal of Multiple Sequence Alignments
    Rajan, Vaibhav
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2013, 30 (03) : 689 - 712
  • [28] OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy
    GPS Raghava
    Stephen MJ Searle
    Patrick C Audley
    Jonathan D Barber
    Geoffrey J Barton
    [J]. BMC Bioinformatics, 4
  • [29] Refinement of phylogenetic signal in multiple sequence alignment: Results of simulation study
    Rusin, L. Y.
    Lyubetsky, V. A.
    [J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE, VOL 3, 2006, : 222 - +
  • [30] ClipKIT: A multiple sequence alignment trimming software for accurate phylogenomic inference
    Steenwyk, Jacob L.
    Buida, Thomas J., III
    Li, Yuanning
    Shen, Xing-Xing
    Rokas, Antonis
    [J]. PLOS BIOLOGY, 2020, 18 (12)