An Integrative Method for Identifying the Over-Annotated Protein-Coding Genes in Microbial Genomes

Cited by: 16
Authors
Yu, Jia-Feng [1 ,2 ]
Xiao, Ke [1 ]
Jiang, Dong-Ke [1 ]
Guo, Jing [1 ]
Wang, Ji-Hua [2 ]
Sun, Xiao [1 ]
Affiliations
[1] Southeast Univ, Sch Biol Sci & Med Engn, State Key Lab Bioelect, Nanjing 210096, Jiangsu, Peoples R China
[2] Dezhou Univ, Dept Phys, Shandong Prov Key Lab Biophys Funct Macromol, Dezhou 253023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
protein-coding gene; microbial genome; re-annotation; horizontal gene transfer; HORIZONTALLY TRANSFERRED GENES; RE-ANNOTATION; GRAPHICAL REPRESENTATION; ESCHERICHIA-COLI; DNA-SEQUENCE; CODON USAGE; BACTERIAL; ORFS; IDENTIFICATION; REANNOTATION;
DOI
10.1093/dnares/dsr030
Chinese Library Classification
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
Falsely annotated protein-coding genes are regarded as one of the major causes of annotation errors in public databases. Although many filtering approaches have been designed for over-annotated protein-coding genes, some are questionable because they increase the number of false negatives. Furthermore, no webserver or software has been specifically devised for the problem of over-annotation. In this study, we propose an integrative algorithm for detecting over-annotated protein-coding genes in microorganisms. Overall, an average accuracy of 99.94% is achieved over 61 microbial genomes. This extremely high accuracy indicates that the presented algorithm effectively differentiates protein-coding genes from non-coding open reading frames. Extensive analyses show that the predictions are reliable and that the integrative algorithm is robust and convenient. Our analysis also indicates that over-annotated protein-coding genes can cause false positives in the detection of horizontal gene transfer. The webserver for the proposed algorithm is freely accessible at www.cbi.seu.edu.cn/RPGM.
Pages: 435-449
Page count: 15
Related papers
50 in total
  • [41] Identify Protein-coding Genes in the Genomes of Aeropyrum pernix K1 and Chlorobium tepidum TLS
    Guo, Feng-Biao
    Lin, Yan
    JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS, 2009, 26 (04) : 413 - 420
  • [42] Gene prediction by spectral rotation measure: A new method for identifying protein-coding regions
    Kotlar, D
    Lavner, Y
    GENOME RESEARCH, 2003, 13 (08) : 1930 - 1937
  • [43] Proteomic Detection of Non-Annotated Protein-Coding Genes in Pseudomonas fluorescens Pf0-1
    Kim, Wook
    Silby, Mark W.
    Purvine, Sam O.
    Nicoll, Julie S.
    Hixson, Kim K.
    Monroe, Matt
    Nicora, Carrie D.
    Lipton, Mary S.
    Levy, Stuart B.
    PLOS ONE, 2009, 4 (12):
  • [44] Systematic analyses of the cancer genome: lessons learned from sequencing most of the annotated human protein-coding genes
    Sjoblom, Tobias
    CURRENT OPINION IN ONCOLOGY, 2008, 20 (01) : 66 - 71
  • [45] The Functional Meaning of 5′UTR in Protein-Coding Genes
    Ryczek, Natalia
    Lys, Aneta
    Makalowska, Izabela
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2023, 24 (03)
  • [46] Expression of protein-coding genes embedded in ribosomal DNA
    Johansen, Steinar D.
    Haugen, Peik
    Nielsen, Henrik
    BIOLOGICAL CHEMISTRY, 2007, 388 (07) : 679 - 686
  • [47] Expression of mitochondrial protein-coding genes in Tetrahymena pyriformis
    Edqvist, J
    Burger, G
    Gray, MW
    JOURNAL OF MOLECULAR BIOLOGY, 2000, 297 (02) : 381 - 393
  • [48] Ovule siRNAs methylate protein-coding genes in trans
    Burgess, Diane
    Chow, Hiu Tung
    Grover, Jeffrey W.
    Freeling, Michael
    Mosher, Rebecca A.
    PLANT CELL, 2022, 34 (10) : 3647 - 3664
  • [49] Phylogenetic informativeness of mitochondrial protein-coding genes in Nematoda
    Ma, X.
    Baeza, J. A.
    Richards, V.
    Agudelo, P. A.
    PHYTOPATHOLOGY, 2020, 110 (07) : 4 - 4
  • [50] POSITIONING OF PROTEIN-CODING GENES ON THE SOYBEAN CHLOROPLAST GENOME
    SINGH, GP
    WALLEN, DG
    PILLAY, DTN
    PLANT MOLECULAR BIOLOGY, 1985, 4 (2-3) : 87 - 93