Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome

Cited by: 8
Authors
Di Marco, Federico [1, 2]
Spitaleri, Andrea [1, 3]
Battaglia, Simone [1]
Batignani, Virginia [1]
Cabibbe, Andrea Maurizio [1]
Cirillo, Daniela Maria [1]
Affiliations
[1] IRCCS San Raffaele Sci Inst, Emerging Bacterial Pathogens Unit, Milan, Italy
[2] Fdn Ctr San Raffaele, Milan, Italy
[3] Univ Vita Salute San Raffaele, Milan, Italy
Keywords
next-generation sequencing; hybrid approach; long reads; drug resistance; Mycobacterium tuberculosis; transmission analysis; repetitive regions; RESISTANCE
DOI
10.3389/fmicb.2023.1104456
Chinese Library Classification code
Q93 [Microbiology]
Discipline classification code
071005; 100705
Abstract
Introduction: In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can affect the accuracy of downstream analyses, including drug resistance identification, transmission analysis, and genome resolution. About 10% of the Mycobacterium tuberculosis (MTB) genome consists of the PE/PPE gene family, a GC-rich, repetitive region. Although short-read sequencing is widely used, its limitations in the PE/PPE regions are well recognized, because short reads map ambiguously onto the reference genome there. The aim of this study was to compare the performance of short-read (SRS), long-read (LRS), and hybrid-read (HYBR) analyses across common investigative tasks: genome coverage estimation, variant calling and cluster analysis, drug resistance detection, and de novo assembly.
Methods: Thirteen model MTB clinical isolates were sequenced with both SRS and LRS. HYBR reads were produced by correcting the long reads with the short reads. The FASTQ files from the three approaches were then processed with a customized version of MTBseq for genome coverage estimation and variant calling, and with two different assemblers for de novo assembly evaluation.
Results: Genome coverage estimation showed lower breadth of coverage at 8X for SRS than for LRS and HYBR. Within the PE/PPE genes, SRS performed poorly for the PE_PGRS family but achieved acceptable coverage in PE and PPE genes, whereas LRS and HYBR reached optimal coverage across the PE/PPE genes. For variant calling, HYBR showed the highest resolution, detecting the highest percentage of uniquely identified mutations compared with LRS and SRS. All three approaches agreed on the identification of two major clusters, with HYBR identifying a higher number of SNPs between the two clusters. In terms of assembly quality, HYBR and LRS obtained better results than SRS.
Discussion: In conclusion, depending on the aim of the investigation, SRS and LRS present complementary advantages and limitations. For a full resolution of MTB genomes, where all the analyses above and both technologies are needed, the HYBR approach represents a valid option and a well-rounded strategy.
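As a rough illustration of the 8X breadth-of-coverage metric mentioned in the Results, the sketch below computes the fraction of reference positions covered at or above a depth threshold. It is not the study's pipeline (the abstract states only that a customized version of MTBseq was used). The function names `breadth_of_coverage` and `parse_depth_lines` are hypothetical, and the input is assumed to be tab-separated per-position depth lines (reference name, position, depth), the layout written by `samtools depth -a`.

```python
from typing import Iterable, Iterator


def parse_depth_lines(lines: Iterable[str]) -> Iterator[int]:
    """Yield the depth column from tab-separated 'ref<TAB>pos<TAB>depth' lines.

    Assumption: one line per reference position, including zero-depth positions.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")
        yield int(depth)


def breadth_of_coverage(depths: Iterable[int], min_depth: int = 8) -> float:
    """Return the fraction of positions with depth >= min_depth."""
    total = 0
    covered = 0
    for depth in depths:
        total += 1
        if depth >= min_depth:
            covered += 1
    if total == 0:
        raise ValueError("no positions supplied")
    return covered / total


# Toy input: four positions, two of which reach the 8X threshold.
example = ["ref\t1\t12", "ref\t2\t8", "ref\t3\t3", "ref\t4\t0"]
breadth = breadth_of_coverage(parse_depth_lines(example))
print(f"8X breadth of coverage: {breadth:.2f}")
```

Restricting the input lines to positions inside a set of genes (for example the PE_PGRS family) would give a per-region breadth in the same way; how the study defined those regions is not stated in the abstract.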
Pages: 9