PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

Cited by: 17
Authors
Green, James R. [1]
Korenberg, Michael J. [2]
Aboul-Magd, Mohammed O. [1]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[2] Queens Univ, Dept Elect & Comp Engn, Kingston, ON, Canada
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
PARALLEL CASCADE IDENTIFICATION; NEURAL-NETWORKS; SERVER; RECOGNITION; CLASSIFIERS; SEQUENCES; ACCURACY; FEATURES; SINGLE;
DOI
10.1186/1471-2105-10-222
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Since the function of a protein is largely dictated by its three-dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report a novel approach to predicting the one-dimensional secondary structure of proteins (distinguishing alpha-helices, beta-strands, and non-regular structures) from primary sequence data. The approach makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification.
Results: Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems, and protein structure classifiers are built using two layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, PCI-based prediction is offered through a web service that provides both human- and machine-readable interfaces. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface permits remote invocation of the PCI-SS service. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input protein sequence data and also to encode the resulting structure prediction in a machine-readable format. To our knowledge, this represents the only publicly available SOAP interface for a protein secondary structure prediction service with a published WSDL interface definition.
Conclusion: Relative to the 9 contemporary methods included in the comparison, cascaded PCI classifiers perform well; however, PCI finds its greatest application as a consensus classifier. When PCI is used to combine a sequence-to-structure PCI-based classifier with the current leading ANN-based method, PSIPRED, the overall accuracy (Q3) is maintained while the rate of occurrence of a particularly detrimental error is reduced by up to 25%. This improvement in BAD score, combined with the machine-readable SOAP web service interface, makes PCI-SS particularly useful for inclusion in a tertiary structure prediction pipeline.
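As a rough illustration of two ideas mentioned in the abstract, the sketch below combines three binary score tracks into a three-state prediction and then computes Q3, conventionally the percentage of residues whose three-state label is predicted correctly. It is not the authors' method: the abstract states that a second layer of PCI classifiers performs the combination, whereas this sketch assumes one-vs-rest sub-problems and substitutes a simple per-residue argmax. The function names, the H/E/C labels, and the score values are all hypothetical.

```python
from typing import Dict, List

# Assumed three-state alphabet: helix, strand, other (non-regular).
STATES = ("H", "E", "C")


def combine_binary_scores(scores: Dict[str, List[float]]) -> str:
    """Pick, per residue, the state whose one-vs-rest score is highest.

    This argmax is a stand-in for the second-layer PCI classifier
    described in the abstract; it is an assumption, not the paper's rule.
    """
    length = len(scores["H"])
    if any(len(scores[s]) != length for s in STATES):
        raise ValueError("all score tracks must have the same length")
    return "".join(
        max(STATES, key=lambda s: scores[s][i]) for i in range(length)
    )


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up scores for a four-residue toy sequence.
scores = {
    "H": [0.9, 0.8, 0.2, 0.1],
    "E": [0.1, 0.3, 0.7, 0.2],
    "C": [0.2, 0.1, 0.3, 0.6],
}
predicted = combine_binary_scores(scores)
print(predicted, q3(predicted, "HHEE"))  # HHEC 75.0
```

In the toy example, three of the four residues match the observed string, giving a Q3 of 75%.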
Pages: 12
Related papers
50 in total
  • [1] PCI-SS: MISO dynamic nonlinear protein secondary structure prediction
    James R Green
    Michael J Korenberg
    Mohammed O Aboul-Magd
    [J]. BMC Bioinformatics, 10
  • [2] PCI-SS: Web-based human and machine interfaces for protein secondary structure prediction
    Aboul-Magd, Mohammed
    Green, James R.
    [J]. 2008 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1-4, 2008, : 1575 - 1578
  • [3] Prediction of protein Secondary Structure using nonlinear method
    Botelho, Silvia
    Simas, Gisele
    Silveira, Patricia
    [J]. NEURAL INFORMATION PROCESSING, PT 3, PROCEEDINGS, 2006, 4234 : 40 - 47
  • [4] Protein Secondary Structure Prediction Using Dynamic Programming
    Jing ZHAO
    Pei-Ming SONG
    Qing FANG
    Jian-Hua LUO
    School of Life Science & Technology; Shanghai Center for Bioinformation and Technology; Logistical Engineering University
    [J]. Acta Biochimica et Biophysica Sinica, 2005, (03) : 167 - 172
  • [5] Protein secondary structure prediction using dynamic programming
    Zhao, J
    Song, PM
    Fang, Q
    Luo, JH
    [J]. ACTA BIOCHIMICA ET BIOPHYSICA SINICA, 2005, 37 (03) : 167 - 172
  • [6] A dynamic Bayesian network approach to protein secondary structure prediction
    Yao, Xin-Qiu
    Zhu, Huaiqiu
    She, Zhen-Su
    [J]. BMC BIOINFORMATICS, 2008, 9 (1)
  • [7] A dynamic Bayesian network approach to protein secondary structure prediction
    Xin-Qiu Yao
    Huaiqiu Zhu
    Zhen-Su She
    [J]. BMC Bioinformatics, 9
  • [8] Dimensional reduction in the protein secondary structure prediction - Nonlinear method improvements
    Simas, Gisele M.
    Botelho, Silvia S. C.
    Grando, Neusa
    Colares, Rafael G.
    [J]. INNOVATIONS IN HYBRID INTELLIGENT SYSTEMS, 2007, 44 : 425 - +
  • [9] PREDICTION OF PROTEIN SECONDARY STRUCTURE
    CHOU, PY
    FASMAN, GD
    [J]. BIOPHYSICAL JOURNAL, 1977, 17 (02) : A53 - A53
  • [10] PREDICTION OF PROTEIN SECONDARY STRUCTURE
    MRAZEK, J
    KYPR, J
    [J]. CHEMICKE LISTY, 1991, 85 (12) : 1203 - 1218