A comparison of scoring functions for protein sequence profile alignment

Times cited: 81
Authors
Edgar, RC [1]
Sjölander, K [1]
Affiliations
[1] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
DOI
10.1093/bioinformatics/bth090
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: In recent years, several methods have been proposed for aligning two protein sequence profiles, with reported improvements in alignment accuracy and homolog discrimination versus sequence-sequence methods (e.g. BLAST) and profile-sequence methods (e.g. PSI-BLAST). Profile-profile alignment is also the iterated step in progressive multiple sequence alignment algorithms such as CLUSTALW. However, little is known about the relative performance of different profile-profile scoring functions. In this work, we evaluate the alignment accuracy of 23 different profile-profile scoring functions by comparing alignments of 488 pairs of sequences with identity ≤30% against structural alignments. We optimize parameters for all scoring functions on the same training set and use profiles of alignments from both PSI-BLAST and SAM-T99. Structural alignments are constructed from a consensus between the FSSP database and the CE structural aligner. We compare the results with sequence-sequence and sequence-profile methods, including BLAST and PSI-BLAST.
Results: We find that profile-profile alignment gives an average improvement over our test set of typically 2-3% over profile-sequence alignment and approximately 40% over sequence-sequence alignment. No statistically significant difference is seen in the relative performance of most of the scoring functions tested. Significantly better results are obtained with profiles constructed from SAM-T99 alignments than from PSI-BLAST alignments.
Pages: 1301-1308
Number of pages: 8