Predicting Protein N-Terminal Signal Peptides Using Position-Specific Amino Acid Propensities and Conditional Random Fields
Cited by: 5
Authors:
Fan, Yong-Xian [1]
Song, Jiangning [2,3,4]
Xu, Chen [5,6]
Shen, Hong-Bin [1,6,7]
Affiliations:
[1] Shanghai Jiao Tong Univ, Dept Automat, Minist Educ China, Shanghai 200240, Peoples R China
[2] Chinese Acad Sci, Tianjin Inst Ind Biotechnol, State Engn Lab Ind Enzymes, Tianjin 300308, Peoples R China
[3] Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Tianjin 300308, Peoples R China
[4] Monash Univ, Dept Biochem & Mol Biol, Fac Med, Melbourne, Vic 3800, Australia
[5] Shanghai Jiao Tong Univ, Sch Med, Dept Histol & Embryol, Shanghai 200025, Peoples R China
[6] Shanghai Key Lab Reprod Med, Shanghai 200025, Peoples R China
[7] Shanghai Jiao Tong Univ, Shanghai Ctr Syst Biomed, Key Lab Syst Biomed, Minist Educ China, Shanghai 200240, Peoples R China
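Funding:
Medical Research Council (UK);
National Health and Medical Research Council of Australia;
National Natural Science Foundation of China;
Keywords: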
Conditional random fields (CRFs);
hydrophobicity alignment;
position-specific amino acid propensities;
secretory protein;
signal peptide;
SEQUENCE;
IDENTIFICATION;
RECOGNITION;
DOI:
10.2174/1574893611308020006
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Protein signal peptides play a vital role in the targeting and translocation of most secreted proteins and many integral membrane proteins in both prokaryotes and eukaryotes. Consequently, accurate prediction of signal peptides and their cleavage sites is an important task in molecular biology. In the present study, we first develop a novel discriminative scoring method for classifying proteins with or without signal peptides. This method captures the characteristics of signal peptides and non-signal peptides by integrating hydrophobicity alignment and position-specific amino acid propensities based on the highest average positions. As a result, it discriminates proteins with signal peptides with overall accuracies of 96.3%, 97.0% and 97.2% in leave-one-out jackknife tests on the constructed benchmark datasets for three organism groups (eukaryotes, Gram-negative bacteria, and Gram-positive bacteria, respectively). Second, we treat the prediction of signal peptide cleavage sites as a sequence labeling problem and apply the Conditional Random Fields (CRFs) algorithm to solve it. Experimental results demonstrate that the proposed CRFs-based cleavage site finding approach achieves prediction success rates of 80.8%, 89.4%, and 74.0%, respectively, for the secretory proteins from the three organism groups. An online tool, LnSignal, is established for labeling the N-terminal signal cleavage sites and is freely available for academic use at http://www.csbio.sjtu.edu.cn/bioinf/LnSignal.
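The two ideas named in the abstract, position-specific amino acid propensities and cleavage-site prediction as sequence labeling, can be illustrated with a minimal Python sketch. This is not the authors' implementation and not the LnSignal code: the function names, the two-label scheme ("S" for signal peptide, "M" for mature protein), the log-odds scoring, and the pseudocount are all assumptions made for illustration, and no CRF is trained here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def label_residues(sequence, signal_length):
    """Turn a known cleavage position into per-residue labels.

    Hypothetical convention: the first `signal_length` residues are
    labeled 'S' (signal peptide), the rest 'M' (mature protein).
    A sequence labeler such as a CRF would be trained on these labels.
    """
    if not 0 <= signal_length <= len(sequence):
        raise ValueError("signal_length must lie within the sequence")
    return ["S"] * signal_length + ["M"] * (len(sequence) - signal_length)


def cleavage_site_from_labels(labels):
    """Recover the cleavage position as the number of leading 'S' labels."""
    count = 0
    for label in labels:
        if label != "S":
            break
        count += 1
    return count


def position_log_odds(sequences, window, pseudocount=1.0):
    """Estimate a position-specific propensity table (assumed log-odds form).

    For each of the first `window` positions, compare the smoothed
    frequency of each amino acid at that position with its smoothed
    frequency over the whole window.
    """
    background = Counter()
    for seq in sequences:
        background.update(c for c in seq[:window] if c in AMINO_ACIDS)
    bg_total = sum(background.values()) + pseudocount * len(AMINO_ACIDS)

    table = []
    for pos in range(window):
        column = Counter(
            seq[pos] for seq in sequences
            if len(seq) > pos and seq[pos] in AMINO_ACIDS
        )
        col_total = sum(column.values()) + pseudocount * len(AMINO_ACIDS)
        table.append({
            aa: math.log(
                ((column[aa] + pseudocount) / col_total)
                / ((background[aa] + pseudocount) / bg_total)
            )
            for aa in AMINO_ACIDS
        })
    return table
```

For example, `label_residues("MKLAQ", 3)` yields three "S" labels followed by two "M" labels, and `cleavage_site_from_labels` maps a predicted label sequence back to a cleavage position. A positive entry in the table returned by `position_log_odds` means the residue is enriched at that position relative to the window as a whole.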
Pages: 183-192
Number of pages: 10