Propensity Scores for Prediction and Characterization of Bioluminescent Proteins from Sequences

Times cited: 22
Author
Huang, Hui-Ling [1, 2]
Affiliations
[1] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu, Taiwan
[2] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu, Taiwan
Source
PLOS ONE | 2014 / Vol. 9 / Issue 05
Keywords
PHOTOPROTEIN AEQUORIN; CRYSTAL-STRUCTURE; FLUORESCENT; DATABASE; MACHINE; CELLS;
DOI
10.1371/journal.pone.0097158
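Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract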
Bioluminescent proteins (BLPs) are a class of proteins from luminous organisms that emit light through various mechanisms, such as bioluminescence and fluorescence. Although BLPs are valuable for commercial and medical applications, identifying them, including luciferases and fluorescent proteins (FPs), is challenging because their protein sequences are highly diverse. Moreover, characterizing BLPs facilitates mutagenesis analysis aimed at enhancing bioluminescence and fluorescence. This study therefore proposes a method, based on a scoring card method (SCM), for estimating the propensity scores of 400 dipeptides and 20 amino acids in order to design two prediction methods and characterize BLPs. The SCMBLP method for predicting BLPs achieves an accuracy of 90.83% in 10-fold cross-validation, higher than existing support vector machine based methods, and a test accuracy of 82.85%. A dataset consisting of 269 luciferases and 216 FPs is also established to design the SCMLFP prediction method, which achieves training and test accuracies of 97.10% and 96.28%, respectively. Additionally, four informative physicochemical properties of the 20 amino acids are identified from the estimated propensity scores to characterize BLPs: 1) high transfer free energy from inside to the protein surface, 2) high occurrence frequency of residues in the transmembrane regions of the protein, 3) large hydrophobicity scale from the native protein structure, and 4) high correlation coefficient (R = 0.921) between the amino acid compositions of BLPs and integral membrane proteins. Further analysis of BLPs reveals that luciferases have a larger value of R (0.937) than FPs (0.635), suggesting that luciferases, more so than FPs, tend to be located near the cell membrane for convenient receipt of extracellular ions.
Importantly, the propensity scores of dipeptides and amino acids and the identified properties facilitate efforts to predict, characterize, and apply BLPs, including luciferases, photoproteins, and FPs. The web server is available at http://iclab.life.nctu.edu.tw/SCMBLP/index.html.
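The abstract does not spell out how a sequence is scored, so the following is only a minimal sketch of one plausible reading: a protein's score is taken as the sum of its dipeptide composition weighted by per-dipeptide propensity scores, and a sequence is called a BLP when that score exceeds a threshold. The function names, the toy propensity values, and the threshold are all hypothetical and are not taken from the paper; the real SCMBLP scores would have to be estimated from training data. A plain Pearson correlation helper is included to illustrate the kind of comparison behind the reported R values.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered dipeptides.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 dipeptides in seq."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        dp = seq[i:i + 2]
        if dp in counts:  # skip pairs containing non-standard residues
            counts[dp] += 1
            total += 1
    if total == 0:
        return {dp: 0.0 for dp in DIPEPTIDES}
    return {dp: c / total for dp, c in counts.items()}


def scm_score(seq, propensity):
    """Weighted sum of dipeptide composition and propensity scores.

    `propensity` maps dipeptide -> score; missing entries count as 0.
    (Assumed scoring form, not confirmed by the abstract.)
    """
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * propensity.get(dp, 0.0) for dp in DIPEPTIDES)


def predict_blp(seq, propensity, threshold):
    """Hypothetical decision rule: positive if the score exceeds threshold."""
    return scm_score(seq, propensity) > threshold


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    toy_propensity = {"AA": 1000.0, "AC": 300.0}  # made-up values
    print(scm_score("ACAC", toy_propensity))
    print(predict_blp("AAAA", toy_propensity, threshold=500.0))
```

With a full table of 400 estimated scores in place of `toy_propensity`, the same functions would rank sequences by score; the threshold would need to be tuned on training data.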
Pages: 15