Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data

Cited by: 4
Authors
Hall, Michael B. [1]
Wick, Ryan R. [1, 2]
Judd, Louise M. [1, 2]
Nguyen, An N. [1]
Steinig, Eike J. [1]
Xie, Ouli [3, 4]
Davies, Mark [1]
Seemann, Torsten [1, 2]
Stinear, Timothy P. [1, 2]
Coin, Lachlan [1]
Affiliations
[1] Univ Melbourne, Peter Doherty Inst Infect & Immun, Dept Microbiol & Immunol, Melbourne, Australia
[2] Univ Melbourne, Ctr Pathogen Genom, Melbourne, Australia
[3] Univ Melbourne, Peter Doherty Inst Infect & Immun, Dept Infect Dis, Melbourne, Australia
[4] Monash Hlth, Monash Infect Dis, Melbourne, Vic, Australia
Source
eLife | 2024 / Volume 13
Funding
UK Medical Research Council
DOI
10.7554/eLife.98300
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Variant calling is fundamental in bacterial genomics, underpinning the identification of disease transmission clusters, the construction of phylogenetic trees, and antimicrobial resistance detection. This study presents a comprehensive benchmarking of variant calling accuracy in bacterial genomes using Oxford Nanopore Technologies (ONT) sequencing data. We evaluated three ONT basecalling models and both simplex (single-strand) and duplex (dual-strand) read types across 14 diverse bacterial species. Our findings reveal that deep learning-based variant callers, particularly Clair3 and DeepVariant, significantly outperform traditional methods and even exceed the accuracy of Illumina sequencing, especially when applied to ONT's super-high accuracy model. ONT's superior performance is attributed to its ability to overcome Illumina's errors, which often arise from difficulties in aligning reads in repetitive and variant-dense genomic regions. Moreover, the use of high-performing variant callers with ONT's super-high accuracy data mitigates ONT's traditional errors in homopolymers. We also investigated the impact of read depth on variant calling, demonstrating that 10x depth of ONT super-accuracy data can achieve precision and recall comparable to, or better than, full-depth Illumina sequencing. These results underscore the potential of ONT sequencing, combined with advanced variant calling algorithms, to replace traditional short-read sequencing methods in bacterial genomics, particularly in resource-limited settings.
Pages: 23