Natural/random protein classification models based on star network topological indices

Cited by: 48
Authors
Munteanu, Cristian Robert [2]
Gonzalez-Diaz, Humberto [1]
Borges, Fernanda [3]
de Magalhaes, Alexandre Lopes [2]
Affiliations
[1] Univ Santiago de Compostela, Fac Pharm, Inst Ind Pharm, UBICA, Santiago De Compostela 15782, Spain
[2] Univ Porto, Dept Chem, Fac Sci, REQUIMTE, P-4169007 Oporto, Portugal
[3] Univ Porto, Dept Organ Chem, Fac Pharm FFUP, P-4050047 Oporto, Portugal
Keywords
Protein structure; Graph theory; Random proteins; Python applications; GDA; S2SNet;
DOI
10.1016/j.jtbi.2008.07.018
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
The development of complex network graphs makes it possible to describe any real system, such as social, neural, computer or genetic networks, by transforming real properties into topological indices (TIs). This work uses Randic's star networks to convert protein primary structure data into specific topological indices, which are then used to construct a natural/random protein classification model. The set of natural proteins contains 1046 protein chains selected from the pre-compiled CulledPDB list from PISCES Dunbrack's Web Lab. This set is characterized by a protein homology of 20%, a structure resolution of 1.6 angstrom and an R-factor lower than 25%. The set of random amino acid chains contains 1046 sequences, generated by a Python script using the same types of residues and the average chain length found in the natural set. A new Sequence to Star Networks (S2SNet) wxPython GUI application (with a Graphviz graphics back-end) was designed by our group to transform any character sequence into the following star network topological indices: Shannon entropy of Markov matrices, trace of connectivity matrices, Harary number, Wiener index, Gutman index, Schultz index, Moreau-Broto indices, Balaban distance connectivity index, Kier-Hall connectivity indices and Randic connectivity index. The model was constructed with the General Discriminant Analysis methods from the STATISTICA package and gave training/predicting set accuracies of 90.77% for the forward stepwise model type. In conclusion, this study extends the classical TIs to protein star network TIs for the first time, proposing a model that can predict whether a protein or protein fragment is natural or random using only the amino acid sequence data. This classification can be used in studies of protein function, by replacing some fragments with random amino acid sequences, or to detect fake amino acid sequences or errors in proteins. These results promote the use of the S2SNet application not only for protein structure analysis but also for mass spectroscopy, clinical proteomics and imaging, or DNA/RNA structure analysis. (C) 2008 Elsevier Ltd. All rights reserved.
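As a rough illustration of two steps the abstract mentions, the sketch below generates a random amino acid chain and computes the Wiener index of a simplified star graph built from a sequence. It is not the S2SNet implementation: the function names (`random_chain`, `star_graph`, `wiener_index`), the uniform sampling over the 20 standard residues, and the non-embedded star graph layout (a dummy central node with one branch per residue type, each residue appended to its branch in sequence order) are all assumptions made for illustration.

```python
import random
from collections import deque

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_chain(length, rng):
    """Return a random chain of the given length (uniform sampling is an assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def star_graph(sequence):
    """Build a simplified, non-embedded star graph as an adjacency dict.

    Node 0 is a dummy center. Residue i (1-based) is attached to the
    previous residue of the same type, or to the center if it is the
    first of its type, so each residue type forms one branch.
    """
    adjacency = {0: set()}
    branch_tip = {}  # residue letter -> last node placed on that branch
    for node, residue in enumerate(sequence, start=1):
        parent = branch_tip.get(residue, 0)
        adjacency[node] = {parent}
        adjacency[parent].add(node)
        branch_tip[residue] = node
    return adjacency


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered node pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both endpoints


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    chain = random_chain(30, rng)
    print(chain, wiener_index(star_graph(chain)))
```

For the sequence "AAB" this layout yields a four-node path (second A, first A, center, B), whose Wiener index is 10. The other indices listed in the abstract would need their own definitions and are not covered here.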
Pages: 775-783
Page count: 9
Related papers
50 in total
  • [1] Random Forest classification based on star graph topological indices for antioxidant proteins
    Fernandez-Blanco, Enrique
    Aguiar-Pulido, Vanessa
    Robert Munteanu, Cristian
    Dorado, Julian
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2013, 317 : 331 - 337
  • [2] Several Characterizations on Degree-Based Topological Indices for Star of David Network
    Salamat, Nadeem
    Kamran, Muhammad
    Ali, Shahbaz
    Alam, Md. Ashraful
    Khan, Riaz Hussain
    [J]. JOURNAL OF MATHEMATICS, 2021, 2021
  • [3] Computational chemistry comparison of stable/nonstable protein mutants classification models based on 3D and topological indices
    Gonzalez-Diaz, Humberto
    Perez-Castillo, Yunierkis
    Podda, Gianni
    Uriarte, Eugenio
    [J]. JOURNAL OF COMPUTATIONAL CHEMISTRY, 2007, 28 (12) : 1990 - 1995
  • [4] Eccentricity Based Topological Indices of an Oxide Network
    Imran, Muhammad
    Siddiqui, Muhammad Kamran
    Abunamous, Amna A. E.
    Adi, Dana
    Rafique, Saida Hafsa
    Baig, Abdul Qudair
    [J]. MATHEMATICS, 2018, 6 (07):
  • [5] Automatic seizure detection based on star graph topological indices
    Fernandez-Blanco, Enrique
    Rivero, Daniel
    Rabunal, Juan
    Dorado, Julian
    Pazos, Alejandro
    Robert Munteanu, Cristian
    [J]. JOURNAL OF NEUROSCIENCE METHODS, 2012, 209 (02) : 410 - 419
  • [6] On degree-based topological indices of random polyomino chains
    Sigarreta, Sayle C.
    Sigarreta, Sayli M.
    Cruz-Suarez, Hugo
    [J]. MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2022, 19 (09) : 8760 - 8773
  • [7] A Novel Protein Characterization Based on Pseudo Amino Acids Composition and Star-Like Graph Topological Indices
    He, Ping-an
    Tao, Hong
    Ma, Tingting
    Dai, Qi
    Yao, Yuhua
    [J]. COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2017, 20 (04) : 328 - 337
  • [8] Non-linear models based on simple topological indices to identify RNase III protein members
    Agueero-Chapin, Guillermin
    de la Riva, Gustavo A.
    Molina-Ruiz, Reinaldo
    Sanchez-Rodriguez, Aminael
    Perez-Machado, Gisselle
    Vasconcelos, Vitor
    Antunes, Agostinho
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2011, 273 (01) : 167 - 178
  • [9] Expected values of sum-based topological indices of random cyclodecane chains
    Tang, Jiang-Hua
    Yousaf, Shamaila
    Javaid Ashraf, Maryam
    Tawfiq, Ferdous M. O.
    Aslam, Adnan
    [J]. PHYSICA SCRIPTA, 2024, 99 (03)
  • [10] On physical analysis of topological indices via curve fitting for natural polymer of cellulose network
    Huang, Rongbing
    Siddiqui, Muhammad Kamran
    Manzoor, Shazia
    Khalid, Sadia
    Almotairi, Sultan
    [J]. EUROPEAN PHYSICAL JOURNAL PLUS, 2022, 137 (03):