Automatic motif discovery in an enzyme database using a genetic algorithm-based approach

Authors
D. F. Tsunoda
H. S. Lopes
Affiliation
[1] CEFET-PR, Laboratório de Bioinformática / CPGEI
Source
Soft Computing | 2006 / Vol. 10
Keywords
Genetic Algorithm; Final Performance; Evolutionary Computation; Heuristic Method; Motif Discovery
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Proteins can be grouped into families according to features such as hydrophobicity, composition, or structure, with the aim of establishing their common biological functions. This paper presents a system designed to discover features (particular sequences of amino acids, or motifs) that occur very often in proteins of a given family but rarely in proteins of other families. These features can be used to classify unknown proteins, that is, to predict their function by analyzing their primary structure. Experiments were run on the enzyme subset extracted from the Protein Data Bank. The heuristic method is based on a genetic algorithm with operators specially tailored to the problem. The motifs found were used to build a decision tree with the C4.5 algorithm. The results were compared with motifs found by MEME, a freely available web tool. A further comparison was made with the classification results of two other systems: a neural network-based tool and a hidden Markov model-based tool. The final performance was measured using sensitivity (Se) and specificity (Sp): similar results were obtained for the proposed tool (Se = 78.79, Sp = 95.82) and the neural network-based tool (Se = 74.65, Sp = 94.80), while MEME and HMMER performed worse. Compared with the other approaches, the proposed system has the advantage of producing comprehensible rules. These results for the enzyme dataset suggest that the proposed evolutionary computation method is very efficient at finding patterns for protein classification.
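To make the evaluation measures in the abstract concrete, the following is a minimal Python sketch of how sensitivity and specificity could be computed for a single candidate motif. It is not the authors' implementation. The exact substring match, the function names, the toy sequences, and the use of Se × Sp as a fitness score are all assumptions made for illustration; the abstract does not state the fitness function or the motif representation used by the genetic algorithm.

```python
def motif_occurs(motif: str, sequence: str) -> bool:
    """Assumed matching rule: the motif is an exact substring of the sequence."""
    return motif in sequence


def sensitivity_specificity(motif, family, others):
    """Return (Se, Sp) for a motif used as a membership test for one family.

    Se = TP / (TP + FN): fraction of family sequences containing the motif.
    Sp = TN / (TN + FP): fraction of other sequences lacking the motif.
    """
    tp = sum(motif_occurs(motif, s) for s in family)
    fn = len(family) - tp
    fp = sum(motif_occurs(motif, s) for s in others)
    tn = len(others) - fp
    se = tp / (tp + fn) if family else 0.0
    sp = tn / (tn + fp) if others else 0.0
    return se, sp


def fitness(motif, family, others):
    """Hypothetical fitness: product of Se and Sp (an assumption, not from the paper)."""
    se, sp = sensitivity_specificity(motif, family, others)
    return se * sp


# Toy data, invented for illustration only (not taken from the Protein Data Bank).
family = ["MKHGDSA", "TTHGDLL", "HGDPPQ", "AAAAAA"]
others = ["MKLLQ", "HGDAA", "PPPQQ", "TTTSS"]

se, sp = sensitivity_specificity("HGD", family, others)
print(se, sp, fitness("HGD", family, others))
```

A motif that is frequent in the target family but rare elsewhere scores high on both measures, which matches the selection criterion described in the abstract.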
Pages: 325 - 330
Number of pages: 5
Related papers
50 in total
  • [1] Automatic motif discovery in an enzyme database using a genetic algorithm-based approach
    Tsunoda, DF
    Lopes, HS
    [J]. SOFT COMPUTING, 2006, 10 (04) : 325 - 330
  • [2] A genetic algorithm-based clustering approach for database partitioning
    Cheng, CH
    Lee, WK
    Wong, KF
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2002, 32 (03) : 215 - 230
  • [3] A Genetic algorithm-Based Approach for Classification Rule Discovery
    Shi, Xian-Jun
    Lei, Hong
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT, INNOVATION MANAGEMENT AND INDUSTRIAL ENGINEERING, VOL 1, 2008, : 175 - 178
  • [4] Motif discovery using an immune genetic algorithm
    Luo Jia-wei
    Wang Ting
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2010, 264 (02) : 319 - 325
  • [5] MDGA: Motif Discovery using a Genetic Algorithm
    Che, Dongsheng
    Song, Yinglei
    Rasheed, Khaled
    [J]. GECCO 2005: GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, VOLS 1 AND 2, 2005, : 447 - 452
  • [6] A genetic algorithm-based segmentation for automatic VOP generation
    Kim, EY
    Park, SH
    [J]. PROTOCOLS AND SYSTEMS FOR INTERACTIVE DISTRIBUTED MULTIMEDIA, PROCEEDINGS, 2002, 2515 : 106 - 117
  • [7] Automatic text summarization with genetic algorithm-based attribute selection
    Silla, CN
    Pappa, GL
    Freitas, AA
    Kaestner, CAA
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2004, 2004, 3315 : 305 - 314
  • [8] A Genetic Algorithm-Based Method for the Automatic Reduction of Reaction Mechanisms
    Sikalo, N.
    Hasemann, O.
    Schulz, C.
    Kempf, A.
    Wlokas, I.
    [J]. INTERNATIONAL JOURNAL OF CHEMICAL KINETICS, 2014, 46 (01) : 41 - 59
  • [9] Optimizing genetic algorithm for motif discovery
    Huo, Hongwei
    Zhao, Zhenhua
    Stojkovic, Vojislav
    Liu, Lifang
    [J]. MATHEMATICAL AND COMPUTER MODELLING, 2010, 52 (11-12) : 2011 - 2020
  • [10] Generic Spaced DNA Motif Discovery Using Genetic Algorithm
    Chan, Tak-Ming
    Leung, Kwong-Sak
    Lee, Kin-Hong
    Lio, Pietro
    [J]. 2010 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2010,