Computer prediction of allergen proteins from sequence-derived protein structural and physicochemical properties

Cited by: 59
Authors
Cui, Juan
Han, Lian Yi
Li, Hu
Ung, Choong Yong
Tang, Zhi Qun
Zheng, Chan Juan
Cao, Zhi Wei
Chen, Yu Zong
Affiliations
[1] Natl Univ Singapore, Dept Pharm & Computat Sci, Bioinformat & Drug Design Grp, Singapore 117543, Singapore
[2] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
Keywords
allergen; immunology; statistical learning method; support vector machine
DOI
10.1016/j.molimm.2006.02.010
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Background: Computational methods have been developed for predicting allergen proteins from sequence segments that show identity, homology, or motif match to a known allergen. These methods achieve good prediction accuracy but are less effective for novel proteins with no similarity to any known allergen. Methods: This work tests the feasibility of using a statistical learning method, support vector machines (SVM), to predict allergen proteins without relying on such similarity. The prediction system is trained and tested using 1005 allergen proteins from the Allergome database and 22,469 non-allergen proteins from 7871 Pfam families. Results: Tests on an independent set of 229 allergen and 6717 non-allergen proteins from 7871 Pfam families show that 93.0% and 99.9% of these, respectively, are correctly predicted, which is comparable to the best results of other methods. Of the 18 novel allergen proteins non-homologous to any other protein in the Swissprot database, 88.9% are correctly predicted. A further screening of 168,128 proteins in the Swissprot database finds that 2.9% are predicted as allergen proteins, which is consistent with the estimates from motif-based methods. Conclusions: Our study suggests that SVM is a potentially useful method for predicting allergen proteins and that it has some capability for predicting novel allergen proteins. Our software can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/APPEL. (c) 2006 Elsevier Ltd. All rights reserved.
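The abstract reports its results only as percentages. As a reading aid, the short Python sketch below converts each reported percentage back into the nearest whole number of proteins. These counts are back-calculated from rounded percentages and are not reported in the abstract, so they should be read as approximate. The helper name `implied_count` and the labels are illustrative assumptions.

```python
def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of proteins matching a rounded percentage."""
    return round(percent / 100 * total)


# Figures quoted in the abstract: (label, reported percentage, set size).
# The labels are illustrative; the numbers come from the abstract.
reported = [
    ("independent-set allergens correctly predicted", 93.0, 229),
    ("independent-set non-allergens correctly predicted", 99.9, 6717),
    ("novel allergens correctly predicted", 88.9, 18),
    ("Swissprot proteins predicted as allergens", 2.9, 168128),
]

# Back-calculated counts (approximate, because the percentages are rounded).
implied = {label: implied_count(pct, n) for label, pct, n in reported}

for label, pct, n in reported:
    k = implied[label]
    # Recompute the percentage from the implied count as a consistency check.
    print(f"{label}: about {k} of {n} ({100 * k / n:.1f}%)")
```

For the small novel-allergen set the percentage pins the count down exactly (16 of 18). For the larger sets, several neighbouring counts round to the same percentage, so the implied value is only the closest match.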
Pages: 514 - 520
Number of pages: 7
Related papers
  • [1] Prediction of MHC-binding peptides of flexible lengths from sequence-derived structural and physicochemical properties
    Cui, J.
    Han, L. Y.
    Lin, H. H.
    Zhang, H. L.
    Tang, Z. Q.
    Zheng, C. J.
    Cao, Z. W.
    Chen, Y. Z.
    [J]. MOLECULAR IMMUNOLOGY, 2007, 44 (05) : 866 - 877
  • [2] Prediction of neddylation sites from protein sequences and sequence-derived properties
    Ahmet Sinan Yavuz
    Namık Berk Sözer
    Osman Uğur Sezerman
    [J]. BMC Bioinformatics, 16
  • [3] Prediction of antibiotic resistance proteins from sequence-derived properties irrespective of sequence similarity
    Zhang, H. L.
    Lin, H. H.
    Tao, L.
    Ma, X. H.
    Dai, J. L.
    Jia, J.
    Cao, Z. W.
    [J]. INTERNATIONAL JOURNAL OF ANTIMICROBIAL AGENTS, 2008, 32 (03) : 221 - 226
  • [4] Prediction of neddylation sites from protein sequences and sequence-derived properties
    Yavuz, Ahmet Sinan
    Sozer, Namik Berk
    Sezerman, Osman Ugur
    [J]. BMC BIOINFORMATICS, 2015, 16
  • [5] Prediction of the functional class of lipid binding proteins from sequence-derived properties irrespective of sequence similarity
    Lin, HH
    Han, LY
    Zhang, HL
    Zheng, CJ
    Xie, B
    Chen, YZ
    [J]. JOURNAL OF LIPID RESEARCH, 2006, 47 (04) : 824 - 831
  • [6] Complementing sequence-derived features with structural information extracted from fragment libraries for protein structure prediction
    Siyuan Liu
    Tong Wang
    Qijiang Xu
    Bin Shao
    Jian Yin
    Tie-Yan Liu
    [J]. BMC Bioinformatics, 22
  • [7] Complementing sequence-derived features with structural information extracted from fragment libraries for protein structure prediction
    Liu, Siyuan
    Wang, Tong
    Xu, Qijiang
    Shao, Bin
    Yin, Jian
    Liu, Tie-Yan
    [J]. BMC BIOINFORMATICS, 2021, 22 (01)
  • [8] CRYSpred: Accurate Sequence-Based Protein Crystallization Propensity Prediction Using Sequence-Derived Structural Characteristics
    Mizianty, Marcin J.
    Kurgan, Lukasz A.
    [J]. PROTEIN AND PEPTIDE LETTERS, 2012, 19 (01) : 40 - 49
  • [9] Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder
    Lorenzo, J. Ramiro
    Alonso, Leonardo G.
    Sanchez, Ignacio E.
    [J]. PLOS ONE, 2015, 10 (12)
  • [10] A machine learning approach for the identification of odorant binding proteins from sequence-derived properties
    Ganesan Pugalenthi
    Ke Tang
    PN Suganthan
    G Archunan
    R Sowdhamini
    [J]. BMC Bioinformatics, 8