SMpred: A Support Vector Machine Approach to Identify Structural Motifs in Protein Structure Without Using Evolutionary Information

Cited by: 4
Authors
Pugalenthi, Ganesan [3]
Kandaswamy, Krishna Kumar [4, 5]
Suganthan, P. N. [1]
Sowdhamini, R. [2]
Martinetz, Thomas [4]
Kolatkar, Prasanna R. [3]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Univ Agr Sci Bangalore, Natl Ctr Biol Sci, Bangalore 560065, Karnataka, India
[3] Genome Inst Singapore, Lab Struct Biochem, Singapore 138672, Singapore
[4] Univ Lubeck, Inst Neuro & Bioinformat, D-23538 Lubeck, Germany
[5] Univ Lubeck, Grad Sch Comp Med & Life Sci, D-23538 Lubeck, Germany
Source
Keywords
Protein folding; Structural motifs; Support vector machine; Fingerprint; Protein function; FUNCTIONAL DOMAIN COMPOSITION; MYCOBACTERIUM-TUBERCULOSIS; SEQUENCE MOTIFS; CLEAVAGE SITES; RNA-POLYMERASE; PREDICTION; SUPERFAMILIES; DATABASE; CLASSIFICATION; IDENTIFICATION;
DOI
10.1080/07391102.2010.10507369
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Knowledge of the three-dimensional structure is essential for understanding the function of a protein. Although the overall fold is determined by the full sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identifying such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, that identifies structural motifs from a single protein structure without using sequence or structural homologs. SMpred was trained and tested on 132 protein domains containing 581 motifs and achieved 78.79% accuracy, with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we show that SMpred is a useful approach for length-deviant superfamilies and single-member superfamilies. These results suggest that our approach can facilitate the identification of structural motifs in a protein structure in the absence of sequence and structural homologs. The dataset and executable for the SMpred algorithm are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SMpred.htm.
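The abstract reports accuracy, sensitivity, and specificity for a two-class (motif vs. non-motif) predictor. As a reading aid, the sketch below shows the standard definitions of these three measures computed from confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical illustrations; they are not taken from the SMpred paper, and the record does not state how the authors computed their figures.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp/fn: positive examples (e.g. motif residues) predicted correctly/incorrectly.
    tn/fp: negative examples predicted correctly/incorrectly.
    """
    sensitivity = 100.0 * tp / (tp + fn)                 # true positive rate
    specificity = 100.0 * tn / (tn + fp)                 # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)   # overall correct fraction
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, sens, spec = binary_metrics(tp=80, fn=20, tn=70, fp=30)
print(f"accuracy={acc:.2f}% sensitivity={sens:.2f}% specificity={spec:.2f}%")
```

When the positive and negative sets are the same size, accuracy equals the mean of sensitivity and specificity. The reported 78.79% lies close to the mean of 79.06% and 78.53%, which would be consistent with, though it does not prove, a roughly balanced evaluation set.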
Pages: 405-414
Page count: 10