An Efficient Computational Intelligence Technique for Classification of Protein Sequences

Cited by: 0
Authors
Iqbal, Muhammad Javed [1]
Faye, Ibrahima [2]
Said, Abas Md [1]
Samir, Brahim Belhaouari [3]
Affiliations
[1] Univ Teknol PETRONAS, Dept Comp & Informat Sci, Tronoh, Malaysia
[2] Univ Teknol PETRONAS, Dept Fundamental & Appl Sci, Tronoh, Malaysia
[3] Alfaisal Univ, Coll Sci, Riyadh, Saudi Arabia
Keywords
Bioinformatics; Feature encoding; Data mining; Superfamily; Protein classification
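DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812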
Abstract
Many artificial intelligence techniques have been developed to process the constantly increasing volume of data and extract meaningful information from it. Accurately annotating an unknown protein by classifying its sequence into an existing superfamily is considered a critical and challenging task in bioinformatics and computational biology. Such classification helps in the analysis and modeling of unknown proteins to determine their structure and function. In this paper, a frequency-based feature encoding technique is used in the proposed framework to represent the amino acids of a protein's primary sequence. The technique considers the occurrence frequency of each amino acid in a sequence. Popular classification algorithms (decision tree, naive Bayes, neural network, random forest, and support vector machine) are employed to evaluate the effectiveness of the encoding method used in the proposed framework. The results indicate that the decision tree classifier performs significantly better than the others in terms of classification accuracy, specificity, sensitivity, and F-measure. A classification accuracy of 88.7% was achieved on yeast protein sequence data taken from the well-known UniProtKB database.
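The frequency-based encoding and the evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `frequency_encode` and `binary_metrics`, the choice to normalize counts by sequence length, and the decision to skip non-standard residue symbols are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import Counter
from typing import Dict, List

# The 20 standard amino acids in one-letter code; the fixed order
# determines the position of each feature in the output vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_encode(sequence: str) -> List[float]:
    """Encode a protein sequence as 20 relative amino acid frequencies.

    Assumptions (not stated in the abstract): symbols outside the 20
    standard residues are ignored, and counts are divided by the number
    of standard residues so that the features sum to 1.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard sequence: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard measures from one-vs-rest confusion counts."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": _ratio(2 * precision * sensitivity,
                            precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy sequence, not taken from UniProtKB.
    vector = frequency_encode("MKVLAAGIVGL")
    print(dict(zip(AMINO_ACIDS, vector)))
    # Made-up confusion counts, for illustration only.
    print(binary_metrics(tp=8, fp=2, tn=9, fn=1))
```

Each sequence becomes a fixed-length vector regardless of its original length, which is what allows the same features to be fed to any of the classifiers listed in the abstract. For a multi-class superfamily problem, the metrics would typically be computed per class in a one-vs-rest fashion and then averaged; that averaging scheme is likewise an assumption here.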
Pages: 6