Propensity Scores for Prediction and Characterization of Bioluminescent Proteins from Sequences

Cited by: 22
Authors
Huang, Hui-Ling [1, 2]
Affiliations
[1] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu, Taiwan
[2] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu, Taiwan
Source
PLOS ONE | 2014 / Vol. 9 / Issue 05
Keywords
PHOTOPROTEIN AEQUORIN; CRYSTAL-STRUCTURE; FLUORESCENT; DATABASE; MACHINE; CELLS;
DOI
10.1371/journal.pone.0097158
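Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract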
Bioluminescent proteins (BLPs) are a class of proteins from luminous organisms that emit light through various mechanisms, such as bioluminescence and fluorescence. Although BLPs are valuable for commercial and medical applications, identifying them, including luciferases and fluorescent proteins (FPs), is challenging owing to the high diversity of their sequences. Moreover, characterizing BLPs facilitates mutagenesis analysis aimed at enhancing bioluminescence and fluorescence. This study therefore proposes a novel approach that estimates the propensity scores of 400 dipeptides and 20 amino acids in order to design two prediction methods and to characterize BLPs, based on a scoring card method (SCM). The SCMBLP method for predicting BLPs achieves a 10-fold cross-validation accuracy of 90.83%, higher than that of existing support vector machine-based methods, and a test accuracy of 82.85%. A dataset of 269 luciferases and 216 FPs is also established to design the SCMLFP prediction method, which achieves training and test accuracies of 97.10% and 96.28%, respectively. Additionally, four informative physicochemical properties of the 20 amino acids are identified from the estimated propensity scores to characterize BLPs: 1) high transfer free energy from inside to the protein surface, 2) high occurrence frequency of residues in the transmembrane regions of the protein, 3) large hydrophobicity scale from the native protein structure, and 4) high correlation coefficient (R = 0.921) between the amino acid compositions of BLPs and integral membrane proteins. Further analysis of BLPs reveals that luciferases have a larger value of R (0.937) than FPs (0.635), suggesting that luciferases are more likely than FPs to be located near the cell membrane, for convenient receipt of extracellular ions. Importantly, the propensity scores of dipeptides and amino acids, together with the identified properties, facilitate efforts to predict, characterize, and apply BLPs, including luciferases, photoproteins, and FPs. The web server is available at http://iclab.life.nctu.edu.tw/SCMBLP/index.html.
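The abstract does not give the scoring formula, so the following is only a minimal sketch of how a dipeptide scoring card could work, under stated assumptions: each of the 400 dipeptides gets a score from the difference in its mean composition between positive and negative training sequences, rescaled to 0..1000, and a sequence is scored by the composition-weighted sum of those scores. The function names, the 0..1000 scale, and the threshold are illustrative, and the sketch omits any score optimization the published method may apply.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return the fraction of each dipeptide among the valid dipeptides in seq."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for a, b in zip(seq, seq[1:]):
        dp = a + b
        if dp in counts:  # skip pairs containing non-standard residues
            counts[dp] += 1
            total += 1
    if total == 0:
        return {dp: 0.0 for dp in DIPEPTIDES}
    return {dp: c / total for dp, c in counts.items()}


def mean_composition(seqs):
    """Average the dipeptide composition over a list of sequences."""
    comps = [dipeptide_composition(s) for s in seqs]
    return {dp: sum(c[dp] for c in comps) / len(comps) for dp in DIPEPTIDES}


def estimate_propensity_scores(positives, negatives):
    """Assumed scheme: composition difference, min-max scaled to 0..1000."""
    pos = mean_composition(positives)
    neg = mean_composition(negatives)
    diff = {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    if hi == lo:  # no signal: give every dipeptide the neutral score
        return {dp: 500.0 for dp in DIPEPTIDES}
    return {dp: (d - lo) / (hi - lo) * 1000.0 for dp, d in diff.items()}


def scm_score(seq, scores):
    """Score a sequence as the composition-weighted sum of dipeptide scores."""
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * scores[dp] for dp in DIPEPTIDES)


def predict(seq, scores, threshold=500.0):
    """Classify as positive when the score exceeds a (hypothetical) threshold."""
    return scm_score(seq, scores) > threshold


# Toy sequences, not real proteins.
toy_scores = estimate_propensity_scores(["AAAA", "AAAC"], ["CCCC", "CCCA"])
print(predict("AAAA", toy_scores), predict("CCCC", toy_scores))
```

In practice the threshold would be tuned on training data, for example by cross-validation. The appeal of a card of per-dipeptide scores is that the scores themselves can be inspected to interpret which residues favor the positive class.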
Pages: 15
Related papers
50 in total
  • [21] SCMMTP: identifying and characterizing membrane transport proteins using propensity scores of dipeptides
    Liou, Yi-Fan
    Vasylenko, Tamara
    Yeh, Chia-Lun
    Lin, Wei-Chun
    Chiu, Shih-Hsiang
    Charoenkwan, Phasit
    Shu, Li-Sun
    Ho, Shinn-Ying
    Huang, Hui-Ling
    BMC GENOMICS, 2015, 16
  • [23] Learning From an Association Analysis Using Propensity Scores
    Kreif, Noemi
    PEDIATRIC CRITICAL CARE MEDICINE, 2021, 22 (12) : 1088 - 1092
  • [24] Propensity scores: From naive enthusiasm to intuitive understanding
    Williamson, Elizabeth
    Morley, Ruth
    Lucas, Alan
    Carpenter, James
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2012, 21 (03) : 273 - 293
  • [25] PREDICTION OF THE ACTIVE-SITES OF PROTEINS FROM AMINO-ACID-SEQUENCES
    NUMAO, N
    KIDOKORO, S
    BIOLOGICAL & PHARMACEUTICAL BULLETIN, 1993, 16 (11) : 1160 - 1163
  • [26] Adding propensity scores to pure prediction models fails to improve predictive performance
    Nowacki, Amy S.
    Wells, Brian J.
    Yu, Changhong
    Kattan, Michael W.
    PEERJ, 2013, 1
  • [27] Automatic Discovery of Bioluminescent Proteins from Large Protein Databases
    Meng, Tao
    Shyu, Mei-Ling
    Zhang, Hua
    2013 IEEE SEVENTH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2013), 2013, : 355 - 362
  • [28] PROTON NMR AND FLUORESCENCE SPECTROSCOPY OF BIOLUMINESCENT PROTEINS FROM JELLYFISH
    KEMPLE, MD
    RAO, BDN
    FEDERATION PROCEEDINGS, 1980, 39 (06) : 1677 - 1677
  • [29] BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection
    Krishna Kumar Kandaswamy
    Ganesan Pugalenthi
    Mehrnaz Khodam Hazrati
    Kai-Uwe Kalies
    Thomas Martinetz
    BMC Bioinformatics, 12
  • [30] Characterization of soluble artificial proteins with random sequences
    Yamauchi, A
    Yomo, T
    Tanaka, F
    Prijambada, ID
    Ohhashi, S
    Yamamoto, K
    Shima, Y
    Ogasahara, K
    Yutani, K
    Kataoka, M
    Urabe, I
    FEBS LETTERS, 1998, 421 (02) : 147 - 151