Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features

Cited by: 5
Authors
Banerjee, Amit Kumar [1 ]
Ravi, Vadlamani [2 ]
Murty, U. S. N. [1 ]
Sengupta, Neelava [1 ]
Karuna, Batepatti [1 ]
Affiliations
[1] Indian Inst Chem Technol CSIR, Div Biol, Bioinformat Grp, Hyderabad, Andhra Pradesh, India
[2] Inst Dev & Res Banking Technol IDRBT, Hyderabad, Andhra Pradesh, India
Keywords
Histidine kinase; Classification; Data mining; Physicochemical property; Support vector machine; Radial basis function; Machine-learning approach; Physicochemical properties
DOI
10.1007/s12010-013-0268-1
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010 ; 081704 ;
Abstract
Standard molecular experimental methodologies and mathematical procedures often fail to resolve many questions of phylogeny and classification. Modern artificial intelligence-based techniques, such as radial basis function networks, genetic algorithms, artificial neural networks, and support vector machines, have ample potential in this regard. Relying on a large number of essential parameters, rather than a single molecular parameter, improves robustness, reliability, and accuracy. This study used a dataset of computed protein physicochemical properties from 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection techniques were employed to identify the features that most strongly influence the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters. Comparative analyses were performed using the RapidMiner data mining platform. The support vector machine proved to be the best method, with a maximum accuracy of more than 91%.
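As a rough illustration of what "protein sequence-derived features" can look like, the following minimal sketch computes a few physicochemical descriptors from a single amino acid sequence. It is not the authors' pipeline: this record does not list the 57 parameters or how they were computed. The function name `sequence_features`, the feature keys, and the choice to approximate "relative sulfur percentage" as the share of cysteine and methionine residues are all assumptions made for the example. Hydrophobicity is taken here as the mean Kyte-Doolittle hydropathy (GRAVY), which is one common convention and may differ from the one used in the study.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_features(seq: str) -> dict:
    """Return a small, hypothetical feature vector for one protein sequence.

    Non-standard symbols (X, B, Z, gaps, whitespace) are ignored.
    """
    residues = [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")

    n = len(residues)
    counts = Counter(residues)

    # Percentage composition of each standard amino acid.
    features = {
        f"pct_{aa}": 100.0 * counts[aa] / n for aa in sorted(KYTE_DOOLITTLE)
    }
    # GRAVY: mean hydropathy over all residues.
    features["gravy"] = sum(KYTE_DOOLITTLE[aa] for aa in residues) / n
    # Assumption: sulfur content approximated by Cys + Met residue share.
    features["pct_sulfur_residues"] = 100.0 * (counts["C"] + counts["M"]) / n
    features["length"] = n
    return features


if __name__ == "__main__":
    # Toy sequence for demonstration only; not taken from the study's dataset.
    toy = "MKTAYIAKQRCM"
    for name, value in sequence_features(toy).items():
        print(name, value)
```

A table of such vectors, one row per protein and labeled by genus, is the kind of input a feature selection step and a classifier such as a support vector machine would then operate on.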
Pages: 1263-1281
Number of pages: 19