Enhancing Long-Read-Based Strain-Aware Metagenome Assembly

Cited by: 6
Authors
Luo, Xiao [1, 2]
Kang, Xiongbin [1]
Schoenhuth, Alexander [1, 2]
Affiliations
[1] Bielefeld Univ, Fac Technol, Genome Data Sci, Bielefeld, Germany
[2] Ctr Wiskunde & Informat, Life Sci & Hlth, Amsterdam, Netherlands
Funding
European Union Horizon 2020
Keywords
long reads; haplotype; strain; metagenome; genome assembly; SINGLE-CELL;
DOI
10.3389/fgene.2022.868280
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Microbial communities are usually highly diverse and, because microorganisms evolve rapidly, often contain multiple strains of the participating species. In such a complex microecosystem, different strains may have different biological functions. Although reconstructing individual genomes at the strain level is vital for accurately deciphering the composition of microbial communities, the problem has remained largely unresolved. Next-generation sequencing is routinely used in metagenome assembly, but its short read length makes it difficult to generate strain-specific genome sequences. This explains why long-read sequencing technologies have recently provided unprecedented opportunities for haplotype- or strain-resolved genome assembly. Here, we propose MetaBooster and MetaBooster-HiFi, two pipelines for strain-aware metagenome assembly from PacBio CLR and Oxford Nanopore long-read sequencing data. Benchmarking experiments on both simulated and real sequencing data demonstrate that either MetaBooster or MetaBooster-HiFi drastically outperforms state-of-the-art de novo metagenome assemblers on all relevant metagenome assembly criteria, including genome fraction, contig length, and error rates.
Pages: 8