An improved classification of G-protein-coupled receptors using sequence-derived features

Cited by: 37
Authors
Peng, Zhen-Ling [2, 3]
Yang, Jian-Yi [1]
Chen, Xin [1]
Affiliations
[1] Nanyang Technol Univ, Sch Phys & Math Sci, Div Math Sci, Singapore 637371, Singapore
[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada
[3] Bijie Univ, Dept Math, Bijie 551700, Guizhou, Peoples R China
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
AMINO-ACID-COMPOSITION; SUBCELLULAR-LOCALIZATION; STRUCTURAL CLASSES; NUCLEAR RECEPTORS; PREDICTION;
DOI
10.1186/1471-2105-11-420
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: G-protein-coupled receptors (GPCRs) play a key role in diverse physiological processes and are the targets of almost two-thirds of marketed drugs. The 3D structures of GPCRs are largely unavailable; however, a large number of GPCR primary sequences are known. To facilitate the identification and characterization of novel receptors, it is therefore very valuable to develop a computational method that accurately predicts GPCRs from protein primary sequences.
Results: We propose a new method, called PCA-GPCR, to predict GPCRs using a comprehensive set of 1497 sequence-derived features. Principal component analysis is first employed to reduce the dimension of the feature space to 32. The resulting 32-dimensional feature vectors are then fed into a simple yet powerful classification algorithm, called intimate sorting, to predict GPCRs at five levels. The prediction at the first level determines whether a protein is a GPCR or a non-GPCR. If it is predicted to be a GPCR, it is further assigned to a family, subfamily, sub-subfamily, and subtype by the classifiers at the second, third, fourth, and fifth levels, respectively. To train the classifiers applied at the five levels, a non-redundant dataset is carefully constructed, which contains 3178, 1589, 4772, 4924, and 2741 protein sequences at the respective levels. Jackknife tests on this training dataset show that the overall accuracies of PCA-GPCR at the five levels (from the first to the fifth) reach up to 99.5%, 88.8%, 80.47%, 80.3%, and 92.34%, respectively. We further perform predictions on a dataset of 1238 GPCRs at the second level, and on two other datasets of 167 and 566 GPCRs, respectively, at the fourth level. The overall prediction accuracies of our method are consistently higher than those of the existing methods it is compared with.
Conclusions: The comprehensive set of 1497 features is believed to be capable of capturing information about amino acid composition, sequence order, and various physicochemical properties of proteins. Therefore, high accuracies are achieved when predicting GPCRs at all five levels with our proposed method.
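The pipeline summarized in the abstract (reduce 1497 features to 32 with principal component analysis, classify, and score by jackknife) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic Gaussian data in place of real sequence-derived features, uses a plain Euclidean nearest-neighbor rule as a stand-in for intimate sorting (whose exact similarity measure is not given in this record), covers only a single two-class level, and fits the PCA once on all samples instead of refitting inside each jackknife round. All function names are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return (mean, components) for the top principal axes of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of vt are principal axes,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project X onto the fitted principal axes."""
    return (X - mean) @ components.T


def nearest_neighbor_label(train_X, train_y, query):
    """Label of the closest training point (stand-in for intimate sorting)."""
    dists = np.linalg.norm(train_X - query, axis=1)
    return train_y[int(np.argmin(dists))]


def jackknife_accuracy(X, y):
    """Leave-one-out test: predict each sample from all the others."""
    n = len(y)
    hits = 0
    for i in range(n):
        mask = np.arange(n) != i  # hold out sample i
        hits += int(nearest_neighbor_label(X[mask], y[mask], X[i]) == y[i])
    return hits / n


# Synthetic stand-in data: two classes (e.g. non-GPCR = 0, GPCR = 1),
# each sample described by 1497 made-up feature values.
rng = np.random.default_rng(0)
n_per_class, n_features = 20, 1497
X = np.vstack([
    rng.normal(0.0, 1.0, (n_per_class, n_features)),
    rng.normal(3.0, 1.0, (n_per_class, n_features)),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

mean, components = pca_fit(X, 32)       # 1497 -> 32 dimensions
Z = pca_transform(X, mean, components)
accuracy = jackknife_accuracy(Z, y)
print(Z.shape, accuracy)
```

Under the same assumptions, the five-level scheme would repeat this step with a separate classifier per level, passing a protein to the next level only when the first level labels it a GPCR.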
Pages: 13