Automated Prediction and Annotation of Small Open Reading Frames in Microbial Genomes

Cited by: 24
Authors
Durrant, Matthew G. [1 ,2 ]
Bhatt, Ami S. [1 ,2 ]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Med Hematol Blood & Marrow Transplantat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
RNA; ALIGNMENT; BACTERIAL; PROTEINS; HIDDEN; SUITE;
DOI
10.1016/j.chom.2020.11.002
Chinese Library Classification (CLC) number
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
Small open reading frames (smORFs) and their encoded microproteins play central roles in microbes. However, there is a vast unexplored space of smORFs within human-associated microbes. A recent bioinformatic analysis used evolutionary conservation signals to enhance prediction of small protein families. To facilitate the annotation of specific smORFs, we introduce SmORFinder. This tool combines profile hidden Markov models of each smORF family and deep learning models that better generalize to smORF families not seen in the training set, resulting in predictions enriched for Ribo-seq translation signals. Feature importance analysis reveals that the deep learning models learn to identify Shine-Dalgarno sequences, deprioritize the wobble position in each codon, and group codon synonyms found in the codon table. A core-genome analysis of 26 bacterial species identifies several core smORFs of unknown function. We pre-compute smORF annotations for thousands of RefSeq isolate genomes and Human Microbiome Project metagenomes and provide these data through a public web portal.
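To make the term "smORF" concrete, the following is a minimal Python sketch of a naive six-frame scan for short open reading frames. It is not SmORFinder's method: the tool described above combines profile hidden Markov models with deep learning models, and nothing here reproduces that. The function names (`find_smorfs`, `reverse_complement`), the start-codon set, and the 50-codon length cutoff are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a naive six-frame scan for short ORFs.
# Assumptions (not taken from the record above): start codons ATG/GTG/TTG,
# standard stop codons, and a cutoff of 50 codons excluding the stop.

START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 50  # assumed smORF length cutoff

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_smorfs(seq, max_codons=MAX_CODONS):
    """Return (strand, start, end, orf) tuples for short ORFs.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand: for "-" hits they index into the reverse complement.
    For each stop codon, only the most upstream in-frame start is used.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i  # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # stop codon not counted
                    if n_codons <= max_codons:
                        hits.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and keep scanning
    return hits


if __name__ == "__main__":
    for hit in find_smorfs("CCATGAAATTTTAGCC"):
        print(hit)
```

A scan like this yields many spurious candidates in a real genome, which is the gap that conservation signals and learned models such as those described in the abstract are meant to close.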
Pages: 121 / +
Number of pages: 15