SMpred: A Support Vector Machine Approach to Identify Structural Motifs in Protein Structure Without Using Evolutionary Information

Cited by: 4
Authors
Pugalenthi, Ganesan [3]
Kandaswamy, Krishna Kumar [4, 5]
Suganthan, P. N. [1]
Sowdhamini, R. [2]
Martinetz, Thomas [4]
Kolatkar, Prasanna R. [3]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Univ Agr Sci Bangalore, Natl Ctr Biol Sci, Bangalore 560065, Karnataka, India
[3] Genome Inst Singapore, Lab Struct Biochem, Singapore 138672, Singapore
[4] Univ Lubeck, Inst Neuro & Bioinformat, D-23538 Lubeck, Germany
[5] Univ Lubeck, Grad Sch Comp Med & Life Sci, D-23538 Lubeck, Germany
Source
Keywords
Protein folding; Structural motifs; Support vector machine; Fingerprint; Protein function; FUNCTIONAL DOMAIN COMPOSITION; MYCOBACTERIUM-TUBERCULOSIS; SEQUENCE MOTIFS; CLEAVAGE SITES; RNA-POLYMERASE; PREDICTION; SUPERFAMILIES; DATABASE; CLASSIFICATION; IDENTIFICATION;
DOI
10.1080/07391102.2010.10507369
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Knowledge of three-dimensional structure is essential to understanding the function of a protein. Although the overall fold is determined by the full details of the sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identifying such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, that identifies structural motifs from a single protein structure without using sequence or structural homologs. The SMpred method was trained and tested using 132 protein domains containing 581 motifs. It achieved 78.79% accuracy with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we showed that SMpred is a useful approach for length-deviant superfamilies and single-member superfamilies. These results suggest that our approach can facilitate the identification of structural motifs in a protein structure in the absence of sequence and structural homologs. The dataset and executable for the SMpred algorithm are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SMpred.htm.
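The abstract reports accuracy, sensitivity, and specificity without restating how they are defined. The sketch below uses the standard confusion-matrix definitions for a binary classifier (motif residue vs. non-motif residue). The function name `classification_metrics` and the counts passed to it are hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp/fn: positive examples predicted correctly/incorrectly.
    tn/fp: negative examples predicted correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    accuracy = 100.0 * (tp + tn) / (positives + negatives)
    sensitivity = 100.0 * tp / positives   # true positive rate
    specificity = 100.0 * tn / negatives   # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only for illustration (not from the paper).
acc, sens, spec = classification_metrics(tp=80, tn=75, fp=25, fn=20)
print(f"accuracy={acc:.2f}% sensitivity={sens:.2f}% specificity={spec:.2f}%")
```

When the positive and negative sets are the same size, as in the hypothetical counts above, accuracy is the mean of sensitivity and specificity.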
Pages: 405-414
Number of pages: 10
Related papers
50 in total
  • [11] Prediction of Protein Structural Classes based on Secondary Structure Sequence using Improved Support Vector Machine (ISVM)
    Manikandan, P.
    Ramyachitra, D.
    [J]. 2016 IEEE UTTAR PRADESH SECTION INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND ELECTRONICS ENGINEERING (UPCON), 2016, : 525 - 530
  • [12] Web usage mining using evolutionary support vector machine
    Jun, SH
    [J]. AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 1015 - 1020
  • [13] Prediction of Protein Secondary Structure using Support Vector Machine with PSSM Profiles
    Wang, Yanchun
    Cheng, Jinyong
    Liu, Yihui
    Chen, Yehong
    [J]. 2016 IEEE INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2016, : 502 - 505
  • [14] Classification of protein quaternary structure with support vector machine
    Zhang, SW
    Pan, Q
    Zhang, HC
    Zhang, YL
    Wang, HY
    [J]. BIOINFORMATICS, 2003, 19 (18) : 2390 - 2396
  • [15] Protein secondary structure prediction with high accuracy using Support Vector Machine
    Shoyaib, Mohammad
    Baker, Syed Murtuza
    Jabid, Taskeed
    Anwar, Firoz
    Khan, Haseena
    [J]. PROCEEDINGS OF 10TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2007), 2007, : 99 - +
  • [16] An evolutionary approach for optimizing content-based image retrieval using a support vector machine
    Kanimozhi, T.
    Latha, K.
    [J]. SCIENCEASIA, 2016, 42 : 34 - 41
  • [17] LOCUSTRA: Accurate prediction of local protein structure using a two-layer support vector machine approach
    Zimmermann, Olav
    Hansmann, Ulrich H. E.
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2008, 48 (09) : 1903 - 1908
  • [18] Proximal support vector machine using local information
    Yang, Xubing
    Chen, Songcan
    Chen, Bin
    Pan, Zhisong
    [J]. NEUROCOMPUTING, 2009, 73 (1-3) : 357 - 365
  • [19] Prediction of Protein Structural Classes Using the Theory of Increment of Diversity and Support Vector Machine
    WANG Fangping
    [J]. Wuhan University Journal of Natural Sciences, 2011, 16 (03) : 260 - 264
  • [20] Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning
    Mirzaei, Shokoufeh
    Sidi, Tomer
    Keasar, Chen
    Crivelli, Silvia
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2019, 16 (05) : 1515 - 1523