Prediction of the functional class of metal-binding proteins from sequence derived physicochemical properties by support vector machine approach

Cited by: 45
Authors
Lin, H. H.
Han, L. Y.
Zhang, H. L.
Zheng, C. J.
Xie, B.
Cao, Z. W.
Chen, Y. Z.
Affiliations
[1] Shanghai Center for Bioinformation Technology, Shanghai 200235, People's Republic of China
[2] National University of Singapore, Department of Pharmacy, Bioinformatics and Drug Design Group, Singapore 117543, Singapore
[3] National University of Singapore, Department of Computational Science, Singapore 117543, Singapore
DOI
10.1186/1471-2105-7-S5-S13
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Metal-binding proteins play important roles in structural stability, signaling, regulation, transport, immune response, metabolism control, and metal homeostasis. Because of their functional and sequence diversity, it is desirable to explore additional methods for predicting metal-binding proteins irrespective of sequence similarity. This work explores support vector machines (SVM) as such a method. SVM prediction systems were developed using 53,333 metal-binding and 147,347 non-metal-binding proteins, and evaluated on an independent set of 31,448 metal-binding and 79,051 non-metal-binding proteins. The computed prediction accuracy is 86.3%, 81.6%, 83.5%, 94.0%, 81.2%, 85.4%, 77.6%, 90.4%, 90.9%, 74.9%, and 78.1% for calcium-binding, cobalt-binding, copper-binding, iron-binding, magnesium-binding, manganese-binding, nickel-binding, potassium-binding, sodium-binding, zinc-binding, and all metal-binding proteins, respectively. The accuracy for the non-member proteins of each class is 88.2%, 99.9%, 98.1%, 91.4%, 87.9%, 94.5%, 99.2%, 99.9%, 99.9%, 98.0%, and 88.0%, respectively. Comparable accuracies were obtained using a different SVM kernel function. Our method correctly predicts as metal-binding 67% of the 87 metal-binding proteins that are non-homologous to any protein in the Swissprot database and 85.3% of the 333 proteins with known metal-binding domains. These results suggest the usefulness of SVM for facilitating the prediction of metal-binding proteins. Our software can be accessed at the SVMProt server http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
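The abstract reports two figures per class: the accuracy for member proteins and the accuracy for non-member proteins. The following is a minimal sketch of how such a pair of figures can be computed from predicted labels, together with one simple example of a sequence-derived descriptor. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not list the descriptors used, so amino acid composition is an assumed stand-in, and the function names `composition` and `member_nonmember_accuracy` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in a sequence.

    Assumption: amino acid composition is used here only as a simple
    example of a sequence-derived feature vector; it is not claimed
    to be the feature set of the paper.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def member_nonmember_accuracy(y_true, y_pred):
    """Return (member accuracy, non-member accuracy) as fractions.

    Labels: 1 = member of the class (e.g. zinc-binding), 0 = non-member.
    Member accuracy is the share of true members predicted as members;
    non-member accuracy is the share of true non-members predicted as
    non-members.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one member and one non-member")
    return tp / (tp + fn), tn / (tn + fp)
```

Reporting the two accuracies separately matters when the classes are imbalanced, as in the evaluation set described above, where non-metal-binding proteins outnumber metal-binding ones: a single overall accuracy could look high even if most members were missed.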
Pages: 10