PROMALS: towards accurate multiple sequence alignments of distantly related proteins

Cited by: 244
Authors
Pei, Jimin
Grishin, Nick V.
Affiliations
[1] Univ Texas, SW Med Ctr, Howard Hughes Med Inst, Dallas, TX 75390 USA
[2] Univ Texas, SW Med Ctr, Dept Biochem, Dallas, TX 75390 USA
DOI
10.1093/bioinformatics/btm017
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Accurate multiple sequence alignments are essential in protein structure modeling, functional prediction and efficient planning of experiments. Although the alignment problem has attracted considerable attention, preparing high-quality alignments for distantly related sequences remains a difficult task.

Results: We developed PROMALS, a multiple alignment method that shows promising results for protein homologs with sequence identity below 10%, aligning close to half of the amino acid residues correctly on average. This is about three times more accurate than traditional pairwise sequence alignment methods. The PROMALS algorithm derives its strength from several sources: (i) sequence database searches to retrieve additional homologs; (ii) accurate secondary structure prediction; (iii) a hidden Markov model that uses a novel combined scoring of amino acids and secondary structures; (iv) probabilistic consistency-based scoring applied to progressive alignment of profiles. Compared to the best alignment methods that do not use secondary structure prediction and database searches (e.g. MUMMALS, ProbCons and MAFFT), PROMALS is up to 30% more accurate, with the improvement being most prominent for highly divergent homologs. Compared to SPEM and HHalign, which also employ database searches and secondary structure prediction, PROMALS shows an accuracy improvement of several percent.
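
Item (iv) of the abstract refers to probabilistic consistency-based scoring. The sketch below illustrates the generic ProbCons-style consistency transformation, in which the match probability for a residue pair in sequences x and y is re-estimated by averaging over paths through every sequence z in the set. It is not the PROMALS implementation: the abstract does not give its formulas, and the hidden Markov model that would produce the input probabilities from amino acids and predicted secondary structure is not reproduced here. The function names, the dictionary layout, and the toy numbers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of an (n x k) matrix and a (k x m) matrix."""
    k, m = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(k)) for j in range(m)]
            for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def consistency_transform(
    post: Dict[Tuple[int, int], Matrix], lengths: List[int]
) -> Dict[Tuple[int, int], Matrix]:
    """Generic consistency transformation (assumed ProbCons-style form).

    post[(x, y)] with x < y holds pairwise match probabilities
    P(x_i ~ y_j). The result is
    P'(x_i ~ y_j) = (1/N) * sum_z sum_k P(x_i ~ z_k) * P(z_k ~ y_j),
    where a sequence matched against itself uses the identity matrix.
    """
    n = len(lengths)

    def get(x: int, y: int) -> Matrix:
        if x == y:
            return identity(lengths[x])
        if (x, y) in post:
            return post[(x, y)]
        return transpose(post[(y, x)])

    out: Dict[Tuple[int, int], Matrix] = {}
    for x in range(n):
        for y in range(x + 1, n):
            acc = [[0.0] * lengths[y] for _ in range(lengths[x])]
            for z in range(n):
                prod = matmul(get(x, z), get(z, y))
                for i in range(lengths[x]):
                    for j in range(lengths[y]):
                        acc[i][j] += prod[i][j]
            out[(x, y)] = [[v / n for v in row] for row in acc]
    return out


# Toy input: three sequences of length 2 (made-up probabilities).
toy_post = {
    (0, 1): [[0.5, 0.0], [0.0, 0.5]],  # weak direct evidence
    (0, 2): [[1.0, 0.0], [0.0, 1.0]],  # strong evidence via sequence 2
    (1, 2): [[1.0, 0.0], [0.0, 1.0]],
}
toy_result = consistency_transform(toy_post, [2, 2, 2])
```

In the toy input, sequences 0 and 1 are only weakly matched directly but are both strongly matched to sequence 2, so the transformed probability for the pair (0, 1) rises above its direct value. This is the intuition behind consistency scoring: intermediate sequences supply evidence that a single pairwise comparison lacks.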
Pages: 802-808
Number of pages: 7