An Efficient Computational Intelligence Technique for Classification of Protein Sequences

Cited by: 0
Authors
Iqbal, Muhammad Javed [1 ]
Faye, Ibrahima [2 ]
Said, Abas Md [1 ]
Samir, Brahim Belhaouari [3 ]
Affiliations
[1] Univ Teknol PETRONAS, Dept Comp & Informat Sci, Tronoh, Malaysia
[2] Univ Teknol PETRONAS, Dept Fundamental & Appl Sci, Tronoh, Malaysia
[3] Alfaisal Univ, Coll Sci, Riyadh, Saudi Arabia
Keywords
Bioinformatics; Feature encoding; Data mining; Superfamily; Protein classification;
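DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract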
Many artificial intelligence techniques have been developed to extract meaningful information from the constantly increasing volume of data. Accurately annotating an unknown protein by classifying its sequence into an existing superfamily is considered a critical and challenging task in bioinformatics and computational biology. Such classification helps in the analysis and modeling of unknown proteins to determine their structure and function. In this paper, a frequency-based feature encoding technique is used in the proposed framework to represent the amino acids of a protein's primary sequence. The technique considers the occurrence frequency of each amino acid in a sequence. Popular classification algorithms such as decision tree, naive Bayes, neural network, random forest, and support vector machine were employed to evaluate the effectiveness of the encoding method used in the proposed framework. The results indicate that the decision tree classifier shows significantly better results in terms of classification accuracy, specificity, sensitivity, F-measure, and other metrics. A classification accuracy of 88.7% was achieved on yeast protein sequence data taken from the well-known UniProtKB database.
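The frequency-based encoding described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption: the 20-letter standard amino acid alphabet, the normalization by sequence length, the function name `frequency_encode`, and the short example sequence are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Assumed alphabet: the 20 standard amino acids in one-letter codes.
# The abstract does not state which alphabet the authors used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_encode(sequence: str) -> list[float]:
    """Map a protein primary sequence to a 20-dimensional vector of
    relative amino acid frequencies (hypothetical helper)."""
    seq = sequence.upper()
    # Count only recognized residues; ambiguous codes such as X are skipped.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully unrecognized sequence: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up toy sequence, for illustration only.
vector = frequency_encode("MKVLAAGIVK")
```

Each sequence, whatever its length, becomes a fixed-length numeric vector, which is the form that classifiers such as a decision tree or a support vector machine expect as input.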
Pages: 6