Fold recognition and accurate sequence-structure alignment of sequences directing β-sheet proteins

Cited by: 22
Authors
McDonnell, Andrew V.
Menke, Matthew
Palmer, Nathan
King, Jonathan
Cowen, Lenore
Berger, Bonnie
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[2] MIT, Dept Biol, Cambridge, MA 02139 USA
[3] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA
[4] MIT, Dept Math, Cambridge, MA 02139 USA
Keywords
fold recognition; protein structure prediction; statistical prediction; parallel right-handed beta-helix; beta-trefoil;
DOI
10.1002/prot.20942
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010; 081704;
Abstract
The ability to predict structure from sequence is particularly important for toxins, virulence factors, allergens, cytokines, and other proteins of public health importance. Many such functions are represented in the parallel beta-helix and beta-trefoil families. A method using pairwise beta-strand interaction probabilities coupled with evolutionary information represented by sequence profiles is developed to tackle these problems for the beta-helix and beta-trefoil folds. The algorithm BetaWrapPro employs a "wrapping" component that may capture folding processes with an initiation stage followed by processive interaction of the sequence with the already-formed motifs. BetaWrapPro outperforms all previous motif recognition programs for these folds, recognizing the beta-helix with 100% sensitivity and 99.7% specificity and the beta-trefoil with 100% sensitivity and 92.5% specificity, in cross-validation on a database of all nonredundant known positive and negative examples of these fold classes in the PDB. It additionally aligns 88% of residues for the beta-helices and 86% for the beta-trefoils accurately (within four residues of the exact position) to the structural template, which is then used with the side-chain packing program SCWRL to produce 3D structure predictions. One striking result has been the prediction of an unexpected parallel beta-helix structure for a pollen allergen, and its recent confirmation through solution of its structure. A Web server running BetaWrapPro is available and outputs putative PDB-style coordinates for sequences predicted to form the target folds. Proteins 2006;63:976-985. (c) 2006 Wiley-Liss, Inc.
Pages: 976-985
Page count: 10
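
The abstract reports three kinds of figures: sensitivity, specificity, and the share of residues aligned within four positions of the exact position. The sketch below shows how such quantities are conventionally computed. It is not taken from BetaWrapPro; the function names `sensitivity_specificity` and `fraction_within` and all input numbers are hypothetical, chosen only to illustrate the definitions.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true fold members recognized.
    specificity = TN / (TN + FP): fraction of non-members correctly rejected.
    """
    return tp / (tp + fn), tn / (tn + fp)


def fraction_within(predicted, reference, tolerance=4):
    """Fraction of residues whose predicted template position lies within
    `tolerance` positions of the reference position.

    `predicted` and `reference` are equal-length lists of integer positions,
    one entry per aligned residue (an assumed, simplified representation).
    """
    if len(predicted) != len(reference):
        raise ValueError("position lists must have equal length")
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return hits / len(reference)


# Toy inputs, not data from the paper.
sens, spec = sensitivity_specificity(tp=10, fn=0, tn=45, fp=5)
accuracy = fraction_within([10, 25, 41, 60], [10, 22, 47, 61])
print(sens, spec, accuracy)
```

With these toy inputs, every positive example is recognized, 45 of 50 negatives are rejected, and three of the four residues fall within the four-position tolerance.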