Automated Prediction and Annotation of Small Open Reading Frames in Microbial Genomes

Cited by: 24
Authors
Durrant, Matthew G. [1, 2]
Bhatt, Ami S. [1, 2]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Med Hematol Blood & Marrow Transplantat, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation;
Keywords
RNA; ALIGNMENT; BACTERIAL; PROTEINS; HIDDEN; SUITE;
DOI
10.1016/j.chom.2020.11.002
Chinese Library Classification (CLC) number
Q93 [Microbiology];
Discipline classification codes
071005; 100705;
Abstract
Small open reading frames (smORFs) and their encoded microproteins play central roles in microbes. However, there is a vast unexplored space of smORFs within human-associated microbes. A recent bioinformatic analysis used evolutionary conservation signals to enhance prediction of small protein families. To facilitate the annotation of specific smORFs, we introduce SmORFinder. This tool combines profile hidden Markov models of each smORF family and deep learning models that better generalize to smORF families not seen in the training set, resulting in predictions enriched for Ribo-seq translation signals. Feature importance analysis reveals that the deep learning models learn to identify Shine-Dalgarno sequences, deprioritize the wobble position in each codon, and group codon synonyms found in the codon table. A core-genome analysis of 26 bacterial species identifies several core smORFs of unknown function. We pre-compute smORF annotations for thousands of RefSeq isolate genomes and Human Microbiome Project metagenomes and provide these data through a public web portal.
Pages: 121 / +
Page count: 15