Prediction of the secondary structures of proteins by using PREDICT, a nearest neighbor method on pattern space

Cited by: 0

Authors:

Joo, K [1]

Kim, I

Kim, SY

Lee, J

Lee, SJ

Affiliations:

[1] Korea Inst Adv Study, Sch Computat Sci, Seoul 130650, South Korea

[2] Soongsil Univ, Dept Bioinformat & Life Sci, Seoul, South Korea

[3] Soongsil Univ, Bioinformat & Mol Design Technol Innovat Ctr, Seoul, South Korea

[4] Soongsil Univ, Comp Aided Mol Design Res Ctr, Seoul, South Korea

[5] Univ Suwon, Dept Phys, Suwon 445890, South Korea

[6] Univ Suwon, Ctr Smart Biomat, Suwon 445890, South Korea

Source:

JOURNAL OF THE KOREAN PHYSICAL SOCIETY | 2004 / Vol. 45 / No. 06

Keywords:

protein structure prediction; secondary structure prediction;

DOI:

Not available

Chinese Library Classification:

O4 [Physics];

Discipline classification code:

0702;

Abstract:

We introduce a novel method for predicting the secondary structure of proteins, PREDICT (PRofile Enumeration DICTionary), in which the nearest-neighbor method is applied to a pattern space. For a given protein sequence, PSI-BLAST is used to generate a profile that defines patterns for amino acid residues and their local sequence environments. By applying PSI-BLAST to protein sequences with known secondary structures, we construct pattern databases. The secondary structure of a query residue of a protein with unknown structure can be determined by comparing the query pattern with those in the pattern databases and selecting the patterns close to the query pattern. We tested PREDICT on the CB513 set (a set of 513 non-homologous proteins) in three different ways. The first test was based on a pattern database derived from 7777 proteins in the Protein Data Bank (PDB), including those homologous to proteins in the CB513 set, and gave an average Q3 score of 78.8% per chain. In the second test, to carry out a more stringent benchmark on the CB513 set, we removed from the 7777 proteins all proteins homologous to the CB513 set, leaving 4330 proteins. Pattern databases were constructed from these proteins, and the average Q3 score was 74.6%. In the third test, we selected one query protein from the CB513 set and built pattern databases by using the remaining 512 proteins. This procedure was repeated for each of the 513 proteins, and the average Q3 score was 73.1%. Finally, we participated in CASP5 (group ID: 531), where we employed a first-layer database based on the 7777 proteins and a second-layer database based on the CB513 set. PREDICT gave quite promising results, with an average Q3 (Sov) score of 78.1% (77.4%) on 55 CASP5 targets.
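The idea in the abstract (nearest-neighbor lookup of profile-derived patterns, scored by Q3) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The window size, the Euclidean distance, the majority vote, and the toy two-column profiles are all assumptions made for illustration (real PSI-BLAST profiles have 20 columns per residue), and the function names `window_pattern`, `predict_state`, and `q3` are hypothetical.

```python
import math
from collections import Counter


def window_pattern(profile, i, half_width):
    """Flatten the profile rows around residue i into one vector.

    Positions past either end of the chain are padded with zeros
    (an assumption; the abstract does not say how ends are handled).
    """
    n_cols = len(profile[0])
    pattern = []
    for j in range(i - half_width, i + half_width + 1):
        if 0 <= j < len(profile):
            pattern.extend(profile[j])
        else:
            pattern.extend([0.0] * n_cols)
    return pattern


def euclidean(a, b):
    """Plain Euclidean distance, used here as a stand-in distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_state(query, database, k=1):
    """Majority vote over the k database patterns closest to the query.

    database is a list of (pattern, state) pairs, with state one of
    'H' (helix), 'E' (strand), or 'C' (coil).
    """
    nearest = sorted(database, key=lambda entry: euclidean(query, entry[0]))[:k]
    votes = Counter(state for _, state in nearest)
    return votes.most_common(1)[0][0]


def q3(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy "pattern database" built from one chain of known structure.
known_profile = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
known_states = "HHEE"
database = [
    (window_pattern(known_profile, i, 1), known_states[i])
    for i in range(len(known_profile))
]

# Query chain with unknown structure.
query_profile = [[0.95, 0.05], [0.85, 0.15], [0.05, 0.95]]
predicted = "".join(
    predict_state(window_pattern(query_profile, i, 1), database, k=1)
    for i in range(len(query_profile))
)
```

With this toy data, `q3(predicted, observed)` gives the per-chain accuracy that the abstract reports as an average over many chains. The leave-one-out test in the abstract corresponds to rebuilding `database` from every chain except the query.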


Pages: 1441-1449

Page count: 9
