Predicting human protein subcellular localization by heterogeneous and comprehensive approaches

Cited by: 6
Authors
Tung, Chi-Hua [1]
Chen, Chi-Wei [2]
Sun, Han-Hao [2]
Chu, Yen-Wei [2, 3]
Affiliations
[1] Chung Hua Univ, Dept Bioinformat, Hsinchu, Taiwan
[2] Natl Chung Hsing Univ 250, Inst Genom & Bioinformat, Taichung 402, Taiwan
[3] Natl Chung Hsing Univ 250, Ctr Biotechnol, Agr Biotechnol Ctr, Inst Mol Biol, Grad Inst Biotechnol, Taichung 402, Taiwan
Source
PLOS ONE | 2017 / Vol. 12 / Issue 06
Keywords
GENE ONTOLOGY TERMS; SINGLE; SITES; MODES
DOI
10.1371/journal.pone.0178832
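Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]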
Subject classification code
07; 0710; 09
Abstract
Drug development and investigation of protein function both require an understanding of protein subcellular localization. We developed a system, REALoc, that can predict the subcellular localization of singleplex and multiplex proteins in humans. This system, based on a comprehensive strategy, consists of two heterogeneous systematic frameworks that integrate one-to-one and many-to-many machine learning methods and use sequence-based features, including amino acid composition, surface accessibility, weighted sign aa index, and sequence similarity profile, as well as gene ontology function-based features. REALoc can be used to predict localization to six subcellular compartments (cell membrane, cytoplasm, endoplasmic reticulum/Golgi, mitochondrion, nucleus, and extracellular). REALoc yielded a 75.3% absolute true success rate during five-fold cross-validation and a 57.1% absolute true success rate in an independent database test, which was more than 10% higher than that of six other prediction systems. Lastly, we analyzed the effects of Vote and GANN models on singleplex and multiplex localization prediction efficacy. REALoc is freely available at http://predictor.nchu.edu.tw/REALoc.
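As an illustration of two terms used in the abstract, the sketch below computes an amino acid composition vector and an "absolute true" success rate. It is not REALoc's implementation. The function names are hypothetical, and the metric definition used here (a prediction counts only when the predicted set of compartments exactly matches the annotated set) is an assumption taken from common usage in multi-label localization work; this record does not define it.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in the sequence.

    Hypothetical helper: non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted compartments match exactly.

    Assumed definition: a multiplex protein counts as correct only if
    every annotated compartment is predicted and nothing extra is.
    """
    if len(true_sets) != len(predicted_sets):
        raise ValueError("inputs must have the same length")
    if not true_sets:
        raise ValueError("inputs must not be empty")
    exact = sum(
        1 for t, p in zip(true_sets, predicted_sets) if set(t) == set(p)
    )
    return exact / len(true_sets)
```

Under this assumed definition, a multiplex protein annotated as cytoplasm and nucleus but predicted as cytoplasm only counts as a miss, which is why absolute true rates tend to be lower than per-label accuracies.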
Pages: 14