Computational discovery and annotation of conserved small open reading frames in fungal genomes

Cited by: 16
Authors
Mat-Sharani, Shuhaila [1, 2]
Firdaus-Raih, Mohd [1, 3]
Affiliations
[1] UKM, Ctr Frontier Sci, Fac Sci & Technol, Bangi 43600, Selangor, Malaysia
[2] Malaysia Genome Inst, Minist Sci Technol & Innovat, Jalan Bangi, Kajang 43000, Selangor, Malaysia
[3] UKM, Inst Syst Biol, Bangi 43600, Selangor, Malaysia
Keywords
Small open reading frames; sORFs; smORF; Conserved; Fungal; PSYCHROPHILIC YEAST; FUNCTIONAL GENOMICS; ANTIFREEZE PROTEIN; SEQUENCE; EXPRESSION; GENERATION; DATABASE; PACKAGE; GENES; BLAST
DOI
10.1186/s12859-018-2550-2
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Small open reading frames (smORF/sORFs) that encode short protein sequences are often overlooked during the standard gene prediction process thus leading to many sORFs being left undiscovered and/or misannotated. For many genomes, a second round of sORF targeted gene prediction can complement the existing annotation. In this study, we specifically targeted the identification of ORFs encoding for 80 amino acid residues or less from 31 fungal genomes. We then compared the predicted sORFs and analysed those that are highly conserved among the genomes.

Results: A first set of sORFs was identified from existing annotations that fitted the maximum of 80 residues criterion. A second set was predicted using parameters that specifically searched for ORF candidates of 80 codons or less in the exonic, intronic and intergenic sequences of the subject genomes. A total of 1986 conserved sORFs were predicted and characterized.

Conclusions: It is evident that numerous open reading frames that could potentially encode for polypeptides consisting of 80 amino acid residues or less are overlooked during standard gene prediction and annotation. From our results, additional targeted reannotation of genomes is clearly able to complement standard genome annotation to identify sORFs. Due to the lack of, and limitations with experimental validation, we propose that a simple conservation analysis can provide an acceptable means of ensuring that the predicted sORFs are sufficiently clear of gene prediction artefacts.
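The length criterion described in the abstract (ORF candidates of 80 codons or less) can be illustrated with a small six-frame scan. This is only a minimal sketch and not the authors' pipeline, which the abstract does not specify. It assumes the standard genetic code stop codons, an ATG start, and unspliced DNA input. The names `find_sorfs`, `reverse_complement`, `max_codons` and `min_codons` are hypothetical and introduced purely for illustration.

```python
# Illustrative only: a naive six-frame scan for short ORFs (ATG ... stop).
# Assumptions (not from the source): standard stop codons, ATG start,
# no splicing, and no handling of ambiguous bases such as N.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sorfs(seq, max_codons=80, min_codons=2):
    """Return (strand, start, end, orf) tuples for short ORFs.

    The codon count includes the start codon and excludes the stop codon,
    so it equals the number of encoded residues. Coordinates are 0-based,
    end-exclusive, and relative to the scanned strand (for "-" they refer
    to the reverse-complemented sequence).
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if min_codons <= n_codons <= max_codons:
                        hits.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return hits
```

For example, `find_sorfs("CCATGAAATTTTAGCC")` reports a single three-codon ORF on the forward strand. A conservation step like the one the abstract proposes would then compare such candidates across genomes, which this sketch does not attempt.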
Pages: 15
Source
[J]. BMC Bioinformatics, 19