Efficient Recovery of Complete Gut Viral Genomes by Combined Short- and Long-Read Sequencing

Cited by: 0
Authors
Chen, Jingchao [1]
Sun, Chuqing [1]
Dong, Yanqi [2,3]
Jin, Menglu [1,4]
Lai, Senying [2,3]
Jia, Longhao [2,3]
Zhao, Xueyang [4]
Wang, Huarui [1]
Gao, Na L. [1,5]
Bork, Peer [6,7,8,9]
Liu, Zhi [10]
Chen, Wei-Hua [1,4,11]
Zhao, Xing-Ming [2,3,12,13,14,15]
Affiliations
[1] Huazhong Univ Sci & Technol, Hubei Key Lab Bioinformat & Mol Imaging, Ctr Artificial Intelligence Biol, Key Lab Mol Biophys,Minist Educ,Coll Life Sci & T, Wuhan 430074, Hubei, Peoples R China
[2] Fudan Univ, Zhongshan Hosp, Dept Neurol, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[4] Henan Normal Univ, Coll Life Sci, Xinxiang 453007, Henan, Peoples R China
[5] Wuhan Univ, Dept Lab Med, Zhongnan Hosp, Wuhan 430071, Peoples R China
[6] Struct & Computat Biol Unit, European Mol Biol Lab, D-69117 Heidelberg, Germany
[7] Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[8] Yonsei Univ, Yonsei Frontier Lab YFL, Seoul 03722, South Korea
[9] Univ Wurzburg, Dept Bioinformat, Bioctr, D-97070 Wurzburg, Germany
[10] Huazhong Univ Sci & Technol, Coll Life Sci & Technol, Dept Biotechnol, Wuhan 430074, Peoples R China
[11] Binzhou Med Univ, Inst Med Artificial Intelligence, Yantai 264003, Peoples R China
[12] Fudan Univ, MOE Key Lab Computat Neurosci & Brain Inspired In, Shanghai 200433, Peoples R China
[13] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[14] Fudan Univ, Inst Brain Sci, State Key Lab Med Neurobiol, Shanghai 200433, Peoples R China
[15] Int Human Phenome Inst Shanghai, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
crAssphage; gubaphage; gut virome; long-read sequencing; PacBio Sequel II; terminase; virus-like particle; INTESTINAL MICROBIOME; SIGNATURE GENES; SINGLE-CELL; VIRUS; VIROME; DNA; BACTERIOPHAGES; IDENTIFICATION; ASSOCIATION; ALIGNMENT;
DOI
10.1002/advs.202305818
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Current metagenome-assembled human gut phage catalogs contain mostly fragmented genomes. Here, a comprehensive gut virome detection procedure is developed, involving virus-like particle (VLP) enrichment from ≈500 g of feces and combined short- and long-read sequencing. Applied to 135 samples, the procedure yields a Chinese Gut Virome Catalog (CHGV) of 21,499 non-redundant viral operational taxonomic units (vOTUs) that are significantly longer than those obtained by short-read sequencing and that include ≈35% (7,675) complete genomes, roughly nine times the proportion in the Gut Virome Database (GVD; ≈4%, 1,443). Interestingly, the majority (≈60%, 13,356) of the CHGV vOTUs are obtained by either long-read or hybrid assemblies, with little overlap with those assembled from the short-read data alone. With this dataset, the vast diversity of the gut virome is elucidated, including the identification of 32% (6,962) novel vOTUs compared to public gut virome databases, dozens of phages that are more prevalent than the crAssphages and/or Gubaphages, and several viral clades that are more diverse than the two. Finally, the functional capacities of the CHGV-encoded proteins are characterized, and a viral-host interaction network is constructed to facilitate future research and applications.

The CHGV, a human gut virome database, comprises 21,499 non-redundant phage genomes obtained through virus-like particle enrichment and combined short- and long-read sequencing. The long reads facilitate the identification of more complete viral genomes and, surprisingly, favor low-abundance ones. The CHGV features novel genomes and prevalent phages exceeding crAssphages/Gubaphages, and enables functional analysis and viral-host interaction network construction. This significantly enhances gut virome research and applications.
Pages: 18