Efficient Recovery of Complete Gut Viral Genomes by Combined Short- and Long-Read Sequencing

Authors
Chen, Jingchao [1]
Sun, Chuqing [1]
Dong, Yanqi [2,3]
Jin, Menglu [1,4]
Lai, Senying [2,3]
Jia, Longhao [2,3]
Zhao, Xueyang [4]
Wang, Huarui [1]
Gao, Na L. [1,5]
Bork, Peer [6,7,8,9]
Liu, Zhi [10]
Chen, Wei-Hua [1,4,11]
Zhao, Xing-Ming [2,3,12,13,14,15]
Affiliations
[1] Huazhong Univ Sci & Technol, Hubei Key Lab Bioinformat & Mol Imaging, Ctr Artificial Intelligence Biol, Key Lab Mol Biophys,Minist Educ,Coll Life Sci & T, Wuhan 430074, Hubei, Peoples R China
[2] Fudan Univ, Zhongshan Hosp, Dept Neurol, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[4] Henan Normal Univ, Coll Life Sci, Xinxiang 453007, Henan, Peoples R China
[5] Wuhan Univ, Dept Lab Med, Zhongnan Hosp, Wuhan 430071, Peoples R China
[6] Struct & Computat Biol Unit, European Mol Biol Lab, D-69117 Heidelberg, Germany
[7] Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[8] Yonsei Univ, Yonsei Frontier Lab YFL, Seoul 03722, South Korea
[9] Univ Wurzburg, Dept Bioinformat, Bioctr, D-97070 Wurzburg, Germany
[10] Huazhong Univ Sci & Technol, Coll Life Sci & Technol, Dept Biotechnol, Wuhan 430074, Peoples R China
[11] Binzhou Med Univ, Inst Med Artificial Intelligence, Yantai 264003, Peoples R China
[12] Fudan Univ, MOE Key Lab Computat Neurosci & Brain Inspired In, Shanghai 200433, Peoples R China
[13] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[14] Fudan Univ, Inst Brain Sci, State Key Lab Med Neurobiol, Shanghai 200433, Peoples R China
[15] Int Human Phenome Inst Shanghai, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
crAssphage; gubaphage; gut virome; long-read sequencing; PacBio Sequel II; terminase; virus-like particle; INTESTINAL MICROBIOME; SIGNATURE GENES; SINGLE-CELL; VIRUS; VIROME; DNA; BACTERIOPHAGES; IDENTIFICATION; ASSOCIATION; ALIGNMENT
DOI
10.1002/advs.202305818
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Current metagenome-assembled human gut phage catalogs contain mostly fragmented genomes. Here, a comprehensive gut virome detection procedure is developed, involving virus-like particle (VLP) enrichment from ≈500 g of feces and combined short- and long-read sequencing. Applied to 135 samples, it yields a Chinese Gut Virome Catalog (CHGV) consisting of 21,499 non-redundant viral operational taxonomic units (vOTUs) that are significantly longer than those obtained by short-read sequencing and contain ≈35% (7,675) complete genomes, ≈nine times the proportion in the Gut Virome Database (GVD; ≈4%, 1,443). Interestingly, the majority (≈60%, 13,356) of the CHGV vOTUs are obtained by either long-read or hybrid assemblies, with little overlap with those assembled from the short-read data alone. With this dataset, the vast diversity of the gut virome is elucidated, including the identification of 32% (6,962) novel vOTUs compared to public gut virome databases, dozens of phages that are more prevalent than the crAssphages and/or Gubaphages, and several viral clades that are more diverse than the two. Finally, the functional capacities of the CHGV-encoded proteins are characterized and a viral-host interaction network is constructed to facilitate future research and applications.

The CHGV, a human gut virome database, comprises 21,499 non-redundant phage genomes obtained through virus-like particle enrichment and combined short- and long-read sequencing. The long reads facilitate the identification of more complete viral genomes and, surprisingly, favor low-abundance ones. The CHGV features novel genomes and prevalent phages exceeding crAssphages/Gubaphages, and enables functional analysis and viral-host interaction network construction. This significantly enhances gut virome research and applications.
Pages: 18