Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly

Cited by: 32
Authors
Holley, Guillaume [1]
Beyter, Doruk [1]
Ingimundardottir, Helga [1]
Moller, Peter L. [2]
Kristmundsdottir, Snodis [1,3]
Eggertsson, Hannes P. [1]
Halldorsson, Bjarni V. [1,3]
Affiliations
[1] Amgen Inc, deCODE Genet, Reykjavik, Iceland
[2] Aarhus Univ, Dept Biomed, Aarhus, Denmark
[3] Reykjavik Univ, Sch Technol, Reykjavik, Iceland
Keywords
GENOME; LIBRARY;
DOI
10.1186/s13059-020-02244-4
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
A major challenge of long-read sequencing data is its high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel calling accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than an assembly of PacBio HiFi reads.
Pages: 22
Related papers
50 in total
  • [31] Advancing long-read nanopore genome assembly and accurate variant calling for rare disease detection
    Negi, Shloka
    Stenton, Sarah L.
    Berger, Seth I.
    Canigiula, Paolo
    Mcnulty, Brandy
    Violich, Ivo
    Gardner, Joshua
    Hillaker, Todd
    O'Rourke, Sara M.
    O'Leary, Melanie C.
    Carbonell, Elizabeth
    Austin-Tse, Christina
    Lemire, Gabrielle
    Serrano, Jillian
    Mangilog, Brian
    Vannoy, Grace
    Kolmogorov, Mikhail
    Vilain, Eric
    O'Donnell-Luria, Anne
    Delot, Emmanuele
    Miga, Karen H.
    Monlong, Jean
    Paten, Benedict
    AMERICAN JOURNAL OF HUMAN GENETICS, 2025, 112 (02)
  • [32] Trowel: a fast and accurate error correction module for Illumina sequencing reads
    Lim, Eun-Cheon
    Mueller, Jonas
    Hagmann, Joerg
    Henz, Stefan R.
    Kim, Sang-Tae
    Weigel, Detlef
    BIOINFORMATICS, 2014, 30 (22) : 3264 - 3265
  • [33] The draft genome of MD-2 pineapple using hybrid error correction of long reads
    Redwan, Raimi M.
    Saidin, Akzam
    Kumar, S. Vijay
    DNA RESEARCH, 2016, 23 (05) : 427 - 439
  • [34] Hybrid-hybrid correction of errors in long reads with HERO
    Kang, Xiongbin
    Xu, Jialu
    Luo, Xiao
    Schoenhuth, Alexander
    GENOME BIOLOGY, 2023, 24 (01)
  • [36] Error Correction and DeNovo Genome Assembly for the MinION Sequencing Reads mixing Illumina Short Reads
    Kchouk, Mehdi
    Elloumi, Mourad
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015 : 1785 - 1785
  • [37] Error filtering, pair assembly and error correction for next-generation sequencing reads
    Edgar, Robert C.
    Flyvbjerg, Henrik
    BIOINFORMATICS, 2015, 31 (21) : 3476 - 3482
  • [38] HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads
    Antipov, Dmitry
    Korobeynikov, Anton
    McLean, Jeffrey S.
    Pevzner, Pavel A.
    BIOINFORMATICS, 2016, 32 (07) : 1009 - 1015
  • [39] Joint Analysis of Long and Short Reads Enables Accurate Estimates of Microbiome Complexity
    Bankevich, Anton
    Pevzner, Pavel A.
    CELL SYSTEMS, 2018, 7 (02) : 192 - +
  • [40] Comprehensive variant detection in a human genome with highly accurate long reads
    Rowell, W. J.
    Wenger, A. M.
    Kolesnikov, A.
    Chang, P.
    Carroll, A.
    Hall, R. J.
    Peluso, P.
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2019, 27 : 1723 - 1723