Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly

Cited by: 32
Authors
Holley, Guillaume [1]
Beyter, Doruk [1]
Ingimundardottir, Helga [1]
Moller, Peter L. [2]
Kristmundsdottir, Snodis [1,3]
Eggertsson, Hannes P. [1]
Halldorsson, Bjarni V. [1,3]
Affiliations
[1] Amgen Inc, deCODE Genet, Reykjavik, Iceland
[2] Aarhus Univ, Dept Biomed, Aarhus, Denmark
[3] Reykjavik Univ, Sch Technol, Reykjavik, Iceland
Keywords
GENOME; LIBRARY;
DOI
10.1186/s13059-020-02244-4
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
A major challenge of long-read sequencing data is its high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel call accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than an assembly of PacBio HiFi reads.
Pages: 22
Related papers
50 items in total
  • [41] Bi-Level Error Correction for PacBio Long Reads
    Liu, Yuansheng
    Lan, Chaowang
    Blumenstein, Michael
    Li, Jinyan
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (03) : 899 - 905
  • [42] Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads
    Song, Li
    Florea, Liliana
    GIGASCIENCE, 2015, 4
  • [43] Scrible: Ultra-Accurate Error-Correction of Pooled Sequenced Reads
    Duma, Denise
    Cordero, Francesca
    Beccuti, Marco
    Ciardo, Gianfranco
    Close, Timothy J.
    Lonardi, Stefano
    ALGORITHMS IN BIOINFORMATICS (WABI 2015), 2015, 9289 : 162 - 174
  • [44] VolcanoSV enables accurate and robust structural variant calling in diploid genomes from single-molecule long read sequencing
    Luo, Can
    Liu, Yichen Henry
    Zhou, Xin Maizie
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [45] Assembly of long, error-prone reads using repeat graphs
    Kolmogorov, Mikhail
    Yuan, Jeffrey
    Lin, Yu
    Pevzner, Pavel A.
    NATURE BIOTECHNOLOGY, 2019, 37 (05) : 540 - 546
  • [46] Assembly of Long Error-Prone Reads Using Repeat Graphs
    Kolmogorov, Mikhail
    Yuan, Jeffrey
    Lin, Yu
    Pevzner, Pavel
    RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, RECOMB 2018, 2018, 10812 : 261 - 262
  • [47] Assembly of long, error-prone reads using repeat graphs
    Mikhail Kolmogorov
    Jeffrey Yuan
    Yu Lin
    Pavel A. Pevzner
    Nature Biotechnology, 2019, 37 : 540 - 546
  • [48] Performance difference of graph-based and alignment-based hybrid error correction methods for error-prone long reads
    Wang, Anqi
    Au, Kin Fai
    GENOME BIOLOGY, 2020, 21 (01)
  • [49] Local assembly of long reads enables phylogenomics of transposable elements in a polyploid cell line
    Han, Shunhua
    Dias, Guilherme B.
    Basting, Preston J.
    Viswanatha, Raghuvir
    Perrimon, Norbert
    Bergman, Casey M.
    NUCLEIC ACIDS RESEARCH, 2022, 50 (21) : E124
  • [50] Performance difference of graph-based and alignment-based hybrid error correction methods for error-prone long reads
    Anqi Wang
    Kin Fai Au
    Genome Biology, 21