Deep learning for predicting 16S rRNA gene copy number

Cited by: 0
Authors
Miao, Jiazheng [1, 3]
Chen, Tianlai [1, 4]
Misir, Mustafa [1]
Lin, Yajuan [1, 2]
Affiliations
[1] Duke Kunshan Univ, Div Nat & Appl Sci, Suzhou, Peoples R China
[2] Texas A&M Univ Corpus Christi, Dept Life Sci, Corpus Christi, TX 78412 USA
[3] Harvard Med Sch, Dept Biomed Informat, Boston, MA USA
[4] Duke Univ, Dept Biomed Engn, Durham, NC USA
Source
SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01
Keywords
CHARACTERS; REGRESSION; DIVERSITY; PARSIMONY; ABUNDANCE; BACTERIA; DATABASE; ARCHAEA; TOOLS; MODEL;
DOI
10.1038/s41598-024-64658-5
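Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code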
07 ; 0710 ; 09 ;
Abstract
Culture-independent 16S rRNA gene metabarcoding is a commonly used method for microbiome profiling. To achieve more quantitative cell fraction estimates, it is important to account for the 16S rRNA gene copy number (hereafter 16S GCN) of different community members. Several bioinformatic tools are currently available to estimate 16S GCN values, based on either taxonomy assignment or phylogeny. Here we present ANNA16 (Artificial Neural Network Approximator for 16S rRNA gene copy number), a novel deep learning-based method that estimates 16S GCN values directly from 16S gene sequence strings. Based on 27,579 16S rRNA gene sequences and gene copy number data from the rrnDB database, we show that ANNA16 outperforms the commonly used 16S GCN prediction algorithms. Interestingly, SHapley Additive exPlanations (SHAP) shows that ANNA16 can identify unexpected informative positions in 16S rRNA gene sequences without any prior phylogenetic knowledge, which suggests potential applications beyond 16S GCN prediction.
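The abstract's motivation, that read counts must be adjusted by 16S GCN to approximate cell fractions, can be illustrated with a minimal sketch. This is not ANNA16 itself and not code from the paper: the function name `gcn_corrected_fractions`, the taxon labels, and the numbers are hypothetical, and the GCN values are assumed to come from some external predictor.

```python
def gcn_corrected_fractions(read_counts, gcn):
    """Estimate cell fractions by dividing 16S read counts by gene copy number.

    read_counts: dict mapping taxon -> number of 16S reads (hypothetical input)
    gcn:         dict mapping taxon -> estimated 16S gene copy number (> 0)
    """
    missing = set(read_counts) - set(gcn)
    if missing:
        raise KeyError(f"no GCN estimate for: {sorted(missing)}")
    if any(gcn[taxon] <= 0 for taxon in read_counts):
        raise ValueError("GCN values must be positive")

    # Each cell contributes roughly `gcn` gene copies, so reads / gcn is
    # proportional to the number of cells of that taxon.
    cells = {taxon: reads / gcn[taxon] for taxon, reads in read_counts.items()}
    total = sum(cells.values())
    if total == 0:
        raise ValueError("all read counts are zero")

    # Renormalize so the estimated cell fractions sum to 1.
    return {taxon: value / total for taxon, value in cells.items()}


# Hypothetical two-taxon community: taxon_A dominates the reads only
# because it carries seven 16S copies per genome.
read_counts = {"taxon_A": 700, "taxon_B": 300}
gcn = {"taxon_A": 7, "taxon_B": 1}
fractions = gcn_corrected_fractions(read_counts, gcn)
print(fractions)
```

In this toy case taxon_A accounts for 70% of the reads, yet after correction it is estimated to make up a minority of the cells, which is why the accuracy of the GCN estimate matters.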
Pages: 14