Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines

Cited by: 5
Authors
Fernandez, M. [1]
Caballero, J.
Fernandez, L.
Abreu, J. I.
Acosta, G.
Affiliations
[1] Univ Matanzas, Ctr Biotechnol Studies, Fac Agron, Mol Modelling Grp, Matanzas 44740, Cuba
[2] Univ Talca, Ctr Bioinformat & Simulac Mol, Talca, Chile
[3] Univ Matanzas, Fac Informat, Artificial Intelligence Lab, Matanzas 44740, Cuba
[4] Natl Bioinformat Ctr, Havana 10200, Cuba
Keywords
protein stability prediction; point mutations; kernel-based methods; graph similarity
DOI
10.1080/08927020701377070
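Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Subject classification codes
070304; 081704
Abstract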
Euclidean distance counts derived from protein 2D graphs were used to encode protein structural information. A total of 35 amino acid 2D distance count (AA2DC) descriptors were calculated from the Euclidean distance matrices (EDM) derived from the 2D graphs at distances ranging from 0.05 to 1.8 units with a lag of 0.05 units. The AA2DC descriptors were tested for building a predictive classification model of the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) for a large data set of 2048 single point mutations in 64 proteins. A support vector machine (SVM) classifier with a radial basis function kernel was implemented to classify the conformational stability of protein mutants. The temperature and pH of the experimental ΔΔG measurements were used for SVM training in addition to the calculated AA2DC descriptors. The optimum SVM model correctly predicted about 72% of ΔΔG signs in cross-validation, both for the whole data set and for stable and unstable mutants separately. To the best of our knowledge, this level of accuracy for stable mutant recognition is the highest ever reported for a predictor using sequence information. Furthermore, the classifier adequately recognized disease-associated unstable mutants of human prion protein and human transthyretin.
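The descriptor and kernel named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the 2D graph is built or how distances are binned, so the sketch takes 2D node coordinates as given and assumes each count covers the pairs whose distance falls in the interval ending at a grid point. Under that assumed binning the grid from 0.05 to 1.8 in steps of 0.05 has 36 points, so the output length need not match the 35 descriptors reported. The function names `distance_counts` and `rbf_kernel`, the `gamma` value, and the example coordinates, temperature and pH are all hypothetical.

```python
import math
from itertools import combinations


def distance_counts(coords, d_min=0.05, d_max=1.8, lag=0.05):
    """Count node pairs per distance bin (assumed binning, see lead-in).

    coords: list of (x, y) positions of the nodes of a 2D graph.
    Bin k collects pairs with distance in (d_min + (k-1)*lag, d_min + k*lag].
    """
    n_bins = round((d_max - d_min) / lag) + 1
    counts = [0] * n_bins
    for (x1, y1), (x2, y2) in combinations(coords, 2):
        dist = math.hypot(x2 - x1, y2 - y1)
        if dist == 0.0:
            continue  # coincident nodes carry no distance information
        # Small tolerance keeps distances that sit on a bin edge in that bin.
        k = math.ceil((dist - d_min) / lag - 1e-9)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


def rbf_kernel(u, v, gamma=0.1):
    """Radial basis function kernel: exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


# Hypothetical toy graph with three nodes; real graphs come from sequences.
toy_coords = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1)]
descriptors = distance_counts(toy_coords)

# The abstract adds temperature and pH to the descriptors for SVM training;
# the values below are placeholders, not data from the paper.
features = descriptors + [25.0, 7.0]
print(len(features), rbf_kernel(features, features))
```

In practice the feature vectors would be passed to an SVM library with an RBF kernel rather than to a hand-written kernel function; the explicit `rbf_kernel` is shown only to make the similarity measure concrete.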
Pages: 889-896
Number of pages: 8