Enhancing Long-Read-Based Strain-Aware Metagenome Assembly

Cited by: 6
Authors
Luo, Xiao [1, 2]
Kang, Xiongbin [1]
Schoenhuth, Alexander [1, 2]
Affiliations
[1] Bielefeld Univ, Fac Technol, Genome Data Sci, Bielefeld, Germany
[2] Ctr Wiskunde & Informat, Life Sci & Hlth, Amsterdam, Netherlands
Funding
European Union Horizon 2020
Keywords
long reads; haplotype; strain; metagenome; genome assembly; SINGLE-CELL;
DOI
10.3389/fgene.2022.868280
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Microbial communities are usually highly diverse and, because microorganisms evolve rapidly, often contain multiple strains of the participating species. In such a complex microecosystem, different strains may show different biological functions. While reconstructing individual genomes at the strain level is vital for accurately deciphering the composition of microbial communities, the problem has remained largely unresolved. Next-generation sequencing is routinely used in metagenome assembly, but its short read length makes it difficult to generate strain-specific genome sequences. Long-read sequencing technologies, by contrast, have recently provided unprecedented opportunities to carry out haplotype- or strain-resolved genome assembly. Here, we propose MetaBooster and MetaBooster-HiFi, two pipelines for strain-aware metagenome assembly from PacBio CLR and Oxford Nanopore long-read sequencing data. Benchmarking experiments on both simulated and real sequencing data demonstrate that either the MetaBooster or the MetaBooster-HiFi pipeline drastically outperforms state-of-the-art de novo metagenome assemblers on all relevant metagenome assembly criteria, including genome fraction, contig length, and error rates.
Pages: 8