Protein secondary structure prediction using DWKF based on SVR-NSGAII

Cited by: 21
Authors
Zangooei, Mohammad Hossein [1]
Jalili, Saeed [1]
Affiliations
[1] Tarbiat Modares Univ, Elect & Comp Engn Fac, Dept Comp Engn, SCS Lab,Sch Elect & Comp Engn, Tehran, Iran
Keywords
Protein secondary structure prediction; Machine learning approach; Support vector regression; Multi-Objective Genetic Algorithm; SUPPORT VECTOR MACHINES; NEURAL-NETWORKS; ACCURACY; SUBSTITUTION; ALGORITHMS; ALIGNMENT; HELICES;
DOI
10.1016/j.neucom.2012.04.015
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting a protein's secondary structure is an important step towards elucidating its three-dimensional structure and its function, and it remains a challenging problem in bioinformatics. The introduction of machine learning to protein structure prediction has addressed this challenge to some extent. In the machine learning and data mining literature, regression and classification are typically viewed as two distinct problems, differentiated by whether the dependent variable is continuous or categorical. There have been efforts to use regression methods to solve classification problems and vice versa. To treat a classification problem as a regression one, we propose a method based on a Support Vector Regression (SVR) classification model, one of the powerful methods in the field of machine intelligence. We apply the Non-dominated Sorting Genetic Algorithm II (NSGAII) to find mapping points (MPs) for rounding a real value to an integer one. NSGAII is also used to tune the SVR kernel parameters to enhance the performance of our model. On the other hand, a suitable SVR kernel function for a particular problem can improve prediction results remarkably, but no single kernel predicts all protein secondary structure classes with acceptable accuracy. Therefore, we use a Dynamic Weighted Kernel Fusion (DWKF) method to fuse three SVR kernels and achieve better performance. To further improve our method, Position-Specific Scoring Matrix (PSSM) profiles are used as its input. The goals of this research are to tune the SVR parameters and fuse the outputs of different SVR kernels in order to determine protein secondary structure classes accurately. Our method achieves classification accuracies of 85.79% and 84.94% on the RS126 and CB513 datasets, respectively, which are promising compared with other classification methods in the literature. Moreover, to gauge our method's behavior against other state-of-the-art methods, an independent dataset is used, on which it achieves 81.4% accuracy. Our method does not achieve the best value for any of the considered performance metrics on the independent dataset, but its values for all metrics are quite acceptable. (C) 2012 Elsevier B.V. All rights reserved.
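The abstract describes two steps that a small example may make more concrete: fusing the real-valued outputs of several SVR kernels with weights, and rounding the fused value to an integer class using mapping points. The following is only a minimal sketch under stated assumptions, not the authors' implementation. The function names, the kernel outputs, the weights, and the mapping points are all hypothetical, and the weights here are fixed rather than dynamic. In the paper the mapping points and kernel parameters are found by NSGAII, which is not shown.

```python
import numpy as np

def fuse_outputs(outputs, weights):
    """Weighted average of per-kernel regression outputs (one row per kernel)."""
    outputs = np.asarray(outputs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ outputs / weights.sum()

def map_to_class(y_real, mapping_points):
    """Turn real-valued outputs into class indices using sorted thresholds."""
    return np.digitize(y_real, mapping_points)

# Hypothetical real-valued outputs of three SVR kernels for three residues.
kernel_outputs = [
    [0.2, 1.1, 2.3],
    [0.4, 0.9, 1.8],
    [0.1, 1.4, 2.0],
]
weights = [0.5, 0.3, 0.2]    # assumed fixed weights, for illustration only
mapping_points = [0.5, 1.5]  # assumed thresholds separating classes 0, 1, 2

fused = fuse_outputs(kernel_outputs, weights)
labels = map_to_class(fused, mapping_points)
print(fused, labels)
```

With two mapping points, the real line is split into three intervals, one for each secondary structure class; which integer corresponds to helix, strand, or coil is a labeling choice that the abstract does not specify.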
Pages: 87 - 101
Number of pages: 15
Related papers
50 in total
  • [1] A Multiobjective RNA Secondary Structure Prediction Algorithm Based on NSGAII
    Zhang, Kai
    Lv, Yulin
    2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 1450 - 1454
  • [2] Protein secondary structure prediction using distance based classifiers
    Ghosh, Ashish
    Parai, Bijnan
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2008, 47 (01) : 37 - 44
  • [3] PROTEIN SECONDARY STRUCTURE PREDICTION USING KNOWLEDGE-BASED POTENTIALS
    Saraswathi, Saras
    Jernigan, Robert L.
    Kloczkowski, Andrzej
    Kolinski, Andrzej
    ICFC 2010/ ICNC 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION AND INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION, 2010, : 370 - 375
  • [4] AN ALGORITHM FOR PROTEIN SECONDARY STRUCTURE PREDICTION BASED ON CLASS PREDICTION
    DELEAGE, G
    ROUX, B
    PROTEIN ENGINEERING, 1987, 1 (04): : 289 - 294
  • [5] Prediction of protein solvent profile using SVR
    Yuan, Z
    Bailey, TL
    PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 2889 - 2892
  • [6] PROTEIN SECONDARY STRUCTURE PREDICTION USING LOGIC-BASED MACHINE LEARNING
    MUGGLETON, S
    KING, RD
    STERNBERG, MJE
    PROTEIN ENGINEERING, 1992, 5 (07): : 647 - 657
  • [7] PREDICTION OF PROTEIN SECONDARY STRUCTURE USING PROBABILITY BASED FEATURES AND A HYBRID SYSTEM
    Ghanty, Pradip
    Pal, Nikhil R.
    Mudi, Rajani K.
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2013, 11 (05)
  • [8] PROTEIN SECONDARY STRUCTURE PREDICTION USING LOGIC-BASED MACHINE LEARNING
    MUGGLETON, S
    KING, RD
    STERNBERG, MJE
    PROTEIN ENGINEERING, 1993, 6 (05): : 549 - 549
  • [9] Protein Secondary Structure Prediction Using Dynamic Programming
    Zhao, Jing
    Song, Pei-Ming
    Fang, Qing
    Luo, Jian-Hua
    School of Life Science & Technology; Shanghai Center for Bioinformation and Technology; Logistical Engineering University
    Acta Biochimica et Biophysica Sinica, 2005, (03) : 167 - 172
  • [10] Protein secondary structure prediction using machine learning
    Zhang, BF
    Chen, ZH
    Murphey, YL
    Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005, : 532 - 537