Efficient Recovery of Complete Gut Viral Genomes by Combined Short- and Long-Read Sequencing

Cited by: 0
Authors
Chen, Jingchao [1]
Sun, Chuqing [1]
Dong, Yanqi [2,3]
Jin, Menglu [1,4]
Lai, Senying [2,3]
Jia, Longhao [2,3]
Zhao, Xueyang [4]
Wang, Huarui [1]
Gao, Na L. [1,5]
Bork, Peer [6,7,8,9]
Liu, Zhi [10]
Chen, Wei-Hua [1,4,11]
Zhao, Xing-Ming [2,3,12,13,14,15]
Affiliations
[1] Huazhong Univ Sci & Technol, Hubei Key Lab Bioinformat & Mol Imaging, Ctr Artificial Intelligence Biol, Key Lab Mol Biophys,Minist Educ,Coll Life Sci & T, Wuhan 430074, Hubei, Peoples R China
[2] Fudan Univ, Zhongshan Hosp, Dept Neurol, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[4] Henan Normal Univ, Coll Life Sci, Xinxiang 453007, Henan, Peoples R China
[5] Wuhan Univ, Dept Lab Med, Zhongnan Hosp, Wuhan 430071, Peoples R China
[6] Struct & Computat Biol Unit, European Mol Biol Lab, D-69117 Heidelberg, Germany
[7] Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[8] Yonsei Univ, Yonsei Frontier Lab YFL, Seoul 03722, South Korea
[9] Univ Wurzburg, Dept Bioinformat, Bioctr, D-97070 Wurzburg, Germany
[10] Huazhong Univ Sci & Technol, Coll Life Sci & Technol, Dept Biotechnol, Wuhan 430074, Peoples R China
[11] Binzhou Med Univ, Inst Med Artificial Intelligence, Yantai 264003, Peoples R China
[12] Fudan Univ, MOE Key Lab Computat Neurosci & Brain Inspired In, Shanghai 200433, Peoples R China
[13] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[14] Fudan Univ, Inst Brain Sci, State Key Lab Med Neurobiol, Shanghai 200433, Peoples R China
[15] Int Human Phenome Inst Shanghai, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
crAssphage; gubaphage; gut virome; long-read sequencing; PacBio Sequel II; terminase; virus-like particle; INTESTINAL MICROBIOME; SIGNATURE GENES; SINGLE-CELL; VIRUS; VIROME; DNA; BACTERIOPHAGES; IDENTIFICATION; ASSOCIATION; ALIGNMENT
DOI
10.1002/advs.202305818
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Current metagenome-assembled human gut phage catalogs contain mostly fragmented genomes. Here, a comprehensive gut virome detection procedure is developed that involves virus-like particle (VLP) enrichment from ≈500 g of feces and combined short- and long-read sequencing. Applied to 135 samples, the procedure yields a Chinese Gut Virome Catalog (CHGV) of 21,499 non-redundant viral operational taxonomic units (vOTUs) that are significantly longer than those obtained by short-read sequencing and include ≈35% (7,675) complete genomes, a proportion ≈nine times that in the Gut Virome Database (GVD, ≈4%, 1,443). Interestingly, the majority (≈60%, 13,356) of the CHGV vOTUs are obtained from either long-read or hybrid assemblies and show little overlap with those assembled from the short-read data alone. With this dataset, the vast diversity of the gut virome is elucidated, including 32% (6,962) of vOTUs that are novel compared to public gut virome databases, dozens of phages that are more prevalent than the crAssphages and/or Gubaphages, and several viral clades that are more diverse than these two. Finally, the functional capacities of the CHGV-encoded proteins are characterized and a viral-host interaction network is constructed to facilitate future research and applications.

The CHGV, a human gut virome database, comprises 21,499 non-redundant phage genomes obtained through virus-like particle enrichment and combined short- and long-read sequencing. The long reads facilitate the identification of more complete viral genomes and, surprisingly, favor low-abundance ones. The CHGV features novel genomes and prevalent phages exceeding crAssphages/Gubaphages, and it enables functional analysis and viral-host interaction network construction. This significantly enhances gut virome research and applications.
Pages: 18