Effect of Reference Genome Selection on the Performance of Computational Methods for Genome-Wide Protein-Protein Interaction Prediction

Cited by: 18
Authors
Muley, Vijaykumar Yogesh [1, 2]
Ranjan, Akash [1]
Affiliations
[1] Ctr DNA Fingerprinting & Diagnost, Computat & Funct Genom Grp, Hyderabad, Andhra Pradesh, India
[2] Dr Babasaheb Ambedkar Marathwada Univ, Dept Biotechnol, Subctr, Osmanabad, Maharashtra, India
Source
PLOS ONE | 2012 / Volume 7 / Issue 07
Keywords
ESCHERICHIA-COLI; FUNCTIONAL LINKAGES; PHYLOGENETIC PROFILES; CONTEXT METHODS; GENE ORDER; NETWORKS; DATABASE; COEVOLUTION; EVOLUTION; CONSERVATION;
DOI
10.1371/journal.pone.0042057
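Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes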
07 ; 0710 ; 09 ;
Abstract
Background: Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to share an evolutionary history. This history can be traced for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs across multiple genomes, known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood, and co-occurrence of the orthologous protein-coding genes in the same cluster or operon; they are collectively known as genomic context methods. A further method, mirrortree, is instead based on the similarity of the phylogenetic trees of two interacting proteins. Comprehensive performance analyses of these methods have been reported frequently in the literature. However, very few studies provide insight into the effect of reference genome selection on the detection of meaningful protein interactions.
Methods: We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled from 565 bacteria according to the phylogenetic diversity of, and relationships between, the organisms. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in the DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods.
Conclusions: Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of the 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that, to obtain good performance, it is better to sample a few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such sampling allows 50-100 genomes to be selected for comparable prediction accuracy when computational resources are limited.
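As an illustration of the phylogenetic profiling idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each query protein is represented by a binary presence/absence profile over the reference genomes and scores protein pairs by Pearson correlation; the scoring choice, the protein names, the profile values, and the `genome_subset` parameter are all hypothetical and chosen only to show how the choice of reference genomes can change a pair's score.

```python
from itertools import combinations
from math import sqrt

# Hypothetical presence/absence profiles (illustrative values, not data from
# the paper): entry i is 1 if a homolog of the query protein is found in
# reference genome i, else 0.
PROFILES = {
    "protA": [1, 1, 0, 1, 0, 1, 0, 0],
    "protB": [1, 1, 0, 1, 0, 1, 0, 1],
    "protC": [0, 0, 1, 0, 1, 0, 1, 1],
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile carries no co-occurrence signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def score_pairs(profiles, genome_subset=None):
    """Score every protein pair by profile correlation.

    genome_subset is an optional list of reference-genome indices; when given,
    only those columns of each profile are used.
    """
    if genome_subset is not None:
        profiles = {
            name: [vec[i] for i in genome_subset]
            for name, vec in profiles.items()
        }
    scores = {
        (p, q): pearson(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }
    # Highest-scoring (most similar) pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for pair, score in score_pairs(PROFILES):
        print(pair, round(score, 3))
    # Restricting the reference set changes the scores.
    for pair, score in score_pairs(PROFILES, genome_subset=[0, 1, 2, 3]):
        print(pair, round(score, 3))
```

In this toy data, restricting the profiles to the first four reference genomes removes the single genome in which protA and protB disagree, so their score rises; this is only meant to show the mechanism, not to reproduce any result from the paper.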
Pages: 13