Prediction of the secondary structures of proteins by using PREDICT, a nearest neighbor method on pattern space

Cited by: 0
Authors
Joo, K [1]
Kim, I
Kim, SY
Lee, J
Lee, J
Lee, SJ
Affiliations
[1] Korea Inst Adv Study, Sch Computat Sci, Seoul 130650, South Korea
[2] Soongsil Univ, Dept Bioinformat & Life Sci, Seoul, South Korea
[3] Soongsil Univ, Bioinformat & Mol Design Technol Innovat Ctr, Seoul, South Korea
[4] Soongsil Univ, Comp Aided Mol Design Res Ctr, Seoul, South Korea
[5] Univ Suwon, Dept Phys, Suwon 445890, South Korea
[6] Univ Suwon, Ctr Smart Biomat, Suwon 445890, South Korea
Keywords
protein structure prediction; secondary structure prediction
DOI
Not available
Chinese Library Classification number
O4 [Physics]
Subject classification code
0702
Abstract
We introduce PREDICT (PRofile Enumeration DICTionary), a novel method for predicting the secondary structure of proteins in which the nearest-neighbor method is applied to a pattern space. For a given protein sequence, PSI-BLAST is used to generate a profile that defines patterns for amino acid residues and their local sequence environments. By applying PSI-BLAST to protein sequences with known secondary structures, we construct pattern databases. The secondary structure of a query residue in a protein of unknown structure is then determined by comparing the query pattern with those in the pattern databases and selecting the patterns close to the query pattern. We tested PREDICT on the CB513 set (a set of 513 non-homologous proteins) in three different ways. The first test used a pattern database derived from 7777 proteins in the Protein Data Bank (PDB), including proteins homologous to those in the CB513 set, and gave an average Q3 score of 78.8% per chain. For the second, more stringent benchmark, we removed from the 7777 proteins all those homologous to proteins in the CB513 set, leaving 4330 proteins; pattern databases constructed from these proteins gave an average Q3 score of 74.6%. In the third test, we selected one query protein from the CB513 set and built pattern databases from the remaining 512 proteins; repeating this procedure for each of the 513 proteins gave an average Q3 score of 73.1%. Finally, we participated in CASP5 (group ID: 531), where we employed a first-layer database based on the 7777 proteins and a second-layer database based on the CB513 set. PREDICT gave promising results, with an average Q3 (Sov) score of 78.1% (77.4%) on 55 CASP5 targets.
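
The general idea described in the abstract (profile windows as patterns, nearest-neighbor lookup in a pattern database, three-state accuracy as the score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The window half-width, the Euclidean distance, the majority vote over k neighbors, and the function names `extract_patterns`, `predict_states`, and `q3_score` are all assumptions made for illustration; the abstract does not specify the distance measure, the pattern size, or how the two database layers are combined.

```python
from collections import Counter

import numpy as np


def extract_patterns(profile, half_window=7):
    """Build one pattern per residue from a profile (one row per residue).

    Each pattern is the flattened window of rows centered on the residue.
    The window size and zero padding at the chain ends are assumptions.
    """
    profile = np.asarray(profile, dtype=float)
    n_residues, width = profile.shape
    pad = np.zeros((half_window, width))
    padded = np.vstack([pad, profile, pad])
    size = 2 * half_window + 1
    return np.stack([padded[i:i + size].ravel() for i in range(n_residues)])


def predict_states(query_patterns, db_patterns, db_states, k=3):
    """Label each query pattern by majority vote among its k nearest
    database patterns (Euclidean distance is an assumption)."""
    db_patterns = np.asarray(db_patterns, dtype=float)
    predicted = []
    for query in np.asarray(query_patterns, dtype=float):
        distances = np.linalg.norm(db_patterns - query, axis=1)
        nearest = np.argsort(distances, kind="stable")[:k]
        votes = Counter(db_states[i] for i in nearest)
        predicted.append(votes.most_common(1)[0][0])
    return "".join(predicted)


def q3_score(predicted, observed):
    """Percentage of residues whose three-state label (H, E, or C)
    matches the observed label."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

In this sketch, a database would be built by calling `extract_patterns` on the profiles of proteins with known structure and concatenating the results alongside their per-residue labels; a leave-one-out test like the third benchmark would rebuild that database once per query protein with the query excluded.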
Pages: 1441-1449
Number of pages: 9