Genome-wide protein localization prediction strategies for gram negative bacteria

Cited by: 25
Authors
Romine, Margaret F. [1]
Affiliations
[1] Pacific NW Natl Lab, Div Biol Sci, Richland, WA 99352 USA
Source
BMC GENOMICS | 2011 / Vol. 12
Keywords
OUTER-MEMBRANE PROTEIN; V SECRETION SYSTEM; SIGNAL PEPTIDES; ESCHERICHIA-COLI; SUBCELLULAR-LOCALIZATION; CYTOPLASMIC MEMBRANE; BORRELIA-BURGDORFERI; TAT PATHWAY; IDENTIFICATION; LIPOPROTEINS;
DOI
10.1186/1471-2164-12-S1-S1
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Genome-wide prediction of protein subcellular localization is an important type of evidence used for inferring protein function. While a variety of computational tools have been developed for this purpose, errors in the gene models and use of protein sorting signals that are not recognized by the more commonly accepted tools can diminish the accuracy of their output.
Results: As part of an effort to manually curate the annotations of 19 strains of Shewanella, numerous insights were gained regarding the use of computational tools and proteomics data to predict protein localization. Identification of the suite of secretion systems present in each strain at the start of the process made it possible to tailor-fit the subsequent localization prediction strategies to each strain for improved accuracy. Comparisons of the computational predictions among orthologous proteins revealed inconsistencies in the computational outputs, which could often be resolved by adjusting the gene models or ortholog group memberships. While proteomic data was useful for verifying start site predictions and post-translational proteolytic cleavage, care was needed to distinguish cellular versus sample processing-mediated cleavage events. Searches for lipoprotein signal peptides revealed that neither TatP nor LipoP are designed for identification of lipoprotein substrates of the twin arginine translocation system and that the +2 rule for lipoprotein sorting does not apply to this Genus. Analysis of the relationships between domain occurrence and protein localization prediction enabled identification of numerous location-informative domains which could then be used to refine or increase confidence in location predictions. This collective knowledge was used to develop a general strategy for predicting protein localization that could be adapted to other organisms.
Conclusion: Improved localization prediction accuracy is not simply a matter of developing better computational algorithms. It also entails gathering key knowledge regarding the host architecture and translocation machinery and associated substrate recognition via experimentation and integration of diverse computational analyses from many proteins and, where possible, that are derived from different species within the same genus.
Pages: 13