Improved eukaryotic detection compatible with large-scale automated analysis of metagenomes

Cited by: 2
Authors
Bazant, Wojtek [1]
Blevins, Ann S. [2]
Crouch, Kathryn [1]
Beiting, Daniel P. [2]
Affiliations
[1] Univ Glasgow, Inst Infect Immun & Inflammat, Coll Med Vet & Life Sci, Glasgow, Scotland
[2] Univ Penn, Sch Vet Med, Dept Pathobiol, Philadelphia, PA 19104 USA
Keywords
Metagenome; Shotgun metagenomics; Microbial eukaryotes; Bioinformatics; Fungi; Mycobiome; Gut microbiome; Biology
DOI
10.1186/s40168-023-01505-1
Chinese Library Classification (CLC) number
Q93 [Microbiology]
Subject classification code
071005; 100705
Abstract
Background: Eukaryotes such as fungi and protists frequently accompany bacteria and archaea in microbial communities. Unfortunately, their presence is difficult to study with "shotgun" metagenomic sequencing since prokaryotic signals dominate in most environments. Recent methods for eukaryotic detection use eukaryote-specific marker genes, but they do not incorporate strategies to handle the presence of eukaryotes that are not represented in the reference marker gene set, and they are not compatible with web-based tools for downstream analysis.
Results: Here, we present CORRAL (for Clustering Of Related Reference ALignments), a tool for the identification of eukaryotes in shotgun metagenomic data based on alignments to eukaryote-specific marker genes and Markov clustering. Using a combination of simulated datasets, mock community standards, and large publicly available human microbiome studies, we demonstrate that our method is not only sensitive and accurate but is also capable of inferring the presence of eukaryotes not included in the marker gene reference, such as novel strains. Finally, we deploy CORRAL on our MicrobiomeDB.org resource, producing an atlas of eukaryotes present in various environments of the human body and linking their presence to study covariates.
Conclusions: CORRAL allows eukaryotic detection to be automated and carried out at scale. Implementation of CORRAL in MicrobiomeDB.org creates a running atlas of microbial eukaryotes in metagenomic studies. Since our approach is independent of the reference used, it may be applicable to other contexts where shotgun metagenomic reads are matched against redundant but non-exhaustive databases, such as the identification of bacterial virulence genes or taxonomic classification of viral reads.
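Note on Markov clustering: the abstract names Markov clustering but does not say how CORRAL applies it. For orientation only, the generic Markov Cluster (MCL) procedure can be sketched in plain Python as alternating expansion (matrix squaring) and inflation (element-wise power followed by column normalization) on a similarity matrix. The function names, parameters, and the toy similarity matrix below are hypothetical illustrations, not CORRAL's implementation or data.

```python
def normalize_columns(m):
    """Scale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Generic MCL on a symmetric similarity matrix (list of lists).

    Returns clusters as sorted lists of node indices.
    """
    n = len(similarity)
    # Self-loops (diagonal set to 1.0) are the usual MCL convention.
    m = normalize_columns(
        [[1.0 if i == j else similarity[i][j] for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that still carries mass lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy input: five reference sequences, where 0-2 are mutually
# similar, 3-4 are mutually similar, and the two groups share nothing.
toy_similarity = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
print(markov_cluster(toy_similarity))
```

In this toy case the two groups have no cross-similarity, so the procedure keeps them as separate clusters. How CORRAL defines similarity between references and which inflation value it uses are not stated in the abstract.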
Pages: 18
Journal
Microbiome, 11