Plus ça change - evolutionary sequence divergence predicts protein subcellular localization signals

Citations: 11
Authors
Fukasawa, Yoshinori [1, 2]
Leung, Ross K. K. [3, 4]
Tsui, Stephen K. W. [3, 4]
Horton, Paul [1, 5]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Kashiwa, Chiba, Japan
[2] Japan Soc Promot Sci, Tokyo Chiyoda, Japan
[3] Chinese Univ Hong Kong, Hong Kong Bioinformat Ctr, Shatin, Hong Kong, Peoples R China
[4] Chinese Univ Hong Kong, Sch Biomed Sci, Shatin, Hong Kong, Peoples R China
[5] Natl Inst Adv Ind Sci & Technol, Computat Biol Res Ctr, Tokyo, Japan
Source
BMC Genomics | 2014 / Vol. 15
Keywords
MITOCHONDRIAL PRESEQUENCES; AMINO-ACIDS; LOCATIONS; CONSERVATION; MULTICLASS; RESIDUES; PATTERNS; PEPTIDE; TOM20;
DOI
10.1186/1471-2164-15-46
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Protein subcellular localization is a central problem in understanding cell biology and has been the focus of intense research. To predict localization from amino acid sequence, a myriad of features have been tried, including amino acid composition, sequence similarity, the presence of certain motifs or domains, and many others. Surprisingly, sequence conservation of sorting motifs has not yet been employed, despite its extensive use for tasks such as the prediction of transcription factor binding sites.
Results: Here, we flip the problem around and present a proof of concept for the idea that the lack of sequence conservation can be a novel feature for localization prediction. We show that for yeast, mammal and plant datasets, evolutionary sequence divergence alone has significant power to identify sequences with N-terminal sorting sequences. Moreover, sequence divergence is nearly as effective when computed on automatically defined ortholog sets as on hand-curated ones. Unfortunately, sequence divergence did not necessarily increase classification performance when combined with some traditional sequence features such as amino acid composition. However, a post-hoc analysis of the proteins in which sequence divergence changes the prediction yielded some proteins with atypical (i.e. not MPP-cleaved) matrix targeting signals as well as a few misannotations.
Conclusion: We report the results of the first quantitative study of the effectiveness of evolutionary sequence divergence as a feature for protein subcellular localization prediction. We show that divergence is indeed useful for prediction, but it is not trivial to improve overall accuracy simply by adding this feature to classical sequence features. Nevertheless, we argue that sequence divergence is a promising feature and show anecdotal examples in which it succeeds where other features fail.
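To make the idea of divergence as a feature concrete, the following is a minimal sketch and not the authors' method. It assumes the orthologs are already aligned to equal length, scores each alignment column as one minus the frequency of its most common residue, and returns the mean score of an N-terminal window minus the mean score of the rest of the protein. The function names, the scoring rule, and the default window length of 20 are all illustrative assumptions; the abstract does not specify any of them.

```python
from collections import Counter


def column_divergence(column):
    """Return 1 - frequency of the most common symbol in one alignment column.

    Assumption: gaps ('-') are treated as an ordinary symbol.
    """
    counts = Counter(column)
    return 1.0 - counts.most_common(1)[0][1] / len(column)


def mean_divergence(aligned, start, end):
    """Average column divergence over alignment columns [start, end)."""
    columns = list(zip(*aligned))[start:end]
    if not columns:
        raise ValueError("empty column range")
    return sum(column_divergence(c) for c in columns) / len(columns)


def n_terminal_divergence_feature(aligned, n_term_len=20):
    """Hypothetical feature: N-terminal divergence minus divergence elsewhere.

    `aligned` is a list of equal-length aligned ortholog sequences.
    A positive value means the N-terminal window varies more than the rest.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    if not 0 < n_term_len < length:
        raise ValueError("n_term_len must be inside the alignment")
    head = mean_divergence(aligned, 0, n_term_len)
    tail = mean_divergence(aligned, n_term_len, length)
    return head - tail


if __name__ == "__main__":
    # Toy alignment (made up for illustration, not real data).
    toy = ["MLSRAAAA", "MKTLAAAA", "MAPQAAAA"]
    print(n_terminal_divergence_feature(toy, n_term_len=4))
```

A value like this could be passed to a classifier alongside other sequence features; the abstract notes that combining divergence with classical features did not necessarily improve accuracy, so any such combination would need to be evaluated empirically.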
Pages: 15