An Integrative Method for Identifying the Over-Annotated Protein-Coding Genes in Microbial Genomes

Cited by: 16
Authors
Yu, Jia-Feng [1, 2]
Xiao, Ke [1]
Jiang, Dong-Ke [1]
Guo, Jing [1]
Wang, Ji-Hua [2]
Sun, Xiao [1]
Affiliations
[1] Southeast Univ, Sch Biol Sci & Med Engn, State Key Lab Bioelect, Nanjing 210096, Jiangsu, Peoples R China
[2] Dezhou Univ, Dept Phys, Shandong Prov Key Lab Biophys Funct Macromol, Dezhou 253023, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
protein-coding gene; microbial genome; re-annotation; horizontal gene transfer; HORIZONTALLY TRANSFERRED GENES; RE-ANNOTATION; GRAPHICAL REPRESENTATION; ESCHERICHIA-COLI; DNA-SEQUENCE; CODON USAGE; BACTERIAL; ORFS; IDENTIFICATION; REANNOTATION;
DOI
10.1093/dnares/dsr030
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
Falsely annotated protein-coding genes are regarded as one of the major causes of annotation errors in public databases. Although many filtering approaches have been designed to remove over-annotated protein-coding genes, some are questionable because they increase the number of false negatives. Furthermore, no web server or software has been devised specifically for the over-annotation problem. In this study, we propose an integrative algorithm for detecting over-annotated protein-coding genes in microorganisms. An average accuracy of 99.94% is achieved over 61 microbial genomes. This extremely high accuracy indicates that the presented algorithm efficiently differentiates protein-coding genes from non-coding open reading frames. Extensive analyses show that the predictions are reliable and that the integrative algorithm is robust and convenient. Our analysis also indicates that over-annotated protein-coding genes can cause false positives in the detection of horizontal gene transfer. The web server for the proposed algorithm is freely accessible at www.cbi.seu.edu.cn/RPGM.
Pages: 435-449
Number of pages: 15
Related papers
50 in total
  • [21] Introns and the origin of protein-coding genes
    Senapathy, P.
    Bertolaet, B.L.
    Seidel, H.M.
    Knowles, J.R.
    Stoltzfus, A.
    Spencer, D.F.
    Zuker, M.
    Logsdon, J.M.
    Doolittle, W.F.
    Science, 1995, 268 (5215)
  • [22] Protein-Coding Genes' Retrocopies and Their Functions
    Kubiak, Magdalena Regina
    Makalowska, Izabela
    VIRUSES-BASEL, 2017, 9 (04):
  • [23] Introns in protein-coding genes in Archaea
    Watanabe, Y
    Yokobori, S
    Inaba, T
    Yamagishi, A
    Oshima, T
    Kawarabayasi, Y
    Kikuchi, H
    Kita, K
    FEBS LETTERS, 2002, 510 (1-2) : 27 - 30
  • [24] Origins of new protein-coding genes
    Unknown
    SCIENCE, 2021, 371 (6531) : 779 - 780
  • [25] A simple method for estimating the intensity of purifying selection in protein-coding genes
    Ophir, R
    Itoh, T
    Graur, D
    Gojobori, T
    MOLECULAR BIOLOGY AND EVOLUTION, 1999, 16 (01) : 49 - 53
  • [26] PROMOTER SEQUENCES OF EUKARYOTIC PROTEIN-CODING GENES
    CHAMBON, P
    HOPPE-SEYLERS ZEITSCHRIFT FUR PHYSIOLOGISCHE CHEMIE, 1981, 362 (04) : 381 - 381
  • [27] Computational prediction of eukaryotic protein-coding genes
    Zhang, MQ
    NATURE REVIEWS GENETICS, 2002, 3 (09) : 698 - 709
  • [28] Selfish DNA in protein-coding genes of Rickettsia
    Ogata, H
    Audic, S
    Barbe, V
    Artiguenave, F
    Fournier, PE
    Raoult, D
    Claverie, JM
    SCIENCE, 2000, 290 (5490) : 347 - 350
  • [29] Quantifying the Mutational Robustness of Protein-Coding Genes
    Ferrada, Evandro
    JOURNAL OF MOLECULAR EVOLUTION, 2021, 89 (06) : 357 - 369
  • [30] INTRONS AND THE ORIGIN OF PROTEIN-CODING GENES - REPLY
    STOLTZFUS, A
    SPENCER, DF
    ZUKER, M
    LOGSDON, JM
    DOOLITTLE, WF
    SCIENCE, 1995, 268 (5215) : 1367 - 1369