Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph

Cited by: 21
Authors
Morisse, Pierre [1]
Lecroq, Thierry [1]
Lefebvre, Arnaud [1]
Affiliations
[1] Normandie Univ, UNIROUEN, LITIS, F-76000 Rouen, France
Keywords
ERROR-CORRECTION; LARGE GENOMES; ACCURATE;
DOI
10.1093/bioinformatics/bty521
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: The recent rise of long read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short read technologies allowed. However, these long reads are very noisy, with error rates of around 10-15% for Pacific Biosciences and up to 30% for Oxford Nanopore. The error correction problem has been tackled either by self-correcting the long reads or by using complementary short reads in a hybrid approach. However, even though sequencing technologies promise to lower the error rate of the long reads below 10%, it is still higher in practice, and correcting such noisy long reads remains an issue. Results: We present HG-CoLoR, a hybrid error correction method that focuses on a seed-and-extend approach based on the alignment of the short reads to the long reads, followed by the traversal of a variable-order de Bruijn graph built from the short reads. Our experiments show that HG-CoLoR manages to efficiently correct highly noisy long reads that display an error rate as high as 44%. When compared to other state-of-the-art long read error correction methods, our experiments also show that HG-CoLoR provides the best trade-off between runtime and quality of the results, and is the only method able to efficiently scale to eukaryotic genomes.
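The traversal step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the seeds have already been found by aligning short reads to a long read, it stores the k-mers of each order in plain Python sets, and the function names (`build_kmer_sets`, `successors`, `link_seeds`) and parameters (`max_k`, `min_k`, `max_len`) are hypothetical. It only shows the general idea of a variable-order de Bruijn graph: extend at the highest order, and fall back to a lower order when no edge exists.

```python
def build_kmer_sets(reads, max_k, min_k):
    """Collect the k-mers of every order from min_k to max_k (hypothetical helper)."""
    kmers = {}
    for k in range(min_k, max_k + 1):
        kmers[k] = {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}
    return kmers


def successors(path, kmers, max_k, min_k):
    """Return (order, bases) for the highest order that has an outgoing edge."""
    for k in range(max_k, min_k - 1, -1):
        if len(path) < k - 1:
            continue
        suffix = path[len(path) - (k - 1):]  # the (k-1)-mer the path ends with
        bases = [b for b in "ACGT" if suffix + b in kmers[k]]
        if bases:
            return k, bases
    return None, []


def link_seeds(source, target, kmers, max_k, min_k, max_len):
    """Depth-first search for a path that starts at source and ends with target."""
    stack = [source]
    while stack:
        path = stack.pop()
        if len(path) > len(source) and path.endswith(target):
            return path
        if len(path) >= max_len:
            continue  # give up on this branch, try another one
        _, bases = successors(path, kmers, max_k, min_k)
        for b in bases:
            stack.append(path + b)
    return None


# Two toy short reads that overlap by only 3 bases ("GCA").
reads = ["ACGTTGCA", "GCATGGA"]

# Variable order (5 down to 3): the gap is bridged by dropping to order 4.
variable = build_kmer_sets(reads, max_k=5, min_k=3)
print(link_seeds("ACGTT", "TGGA", variable, 5, 3, max_len=20))  # ACGTTGCATGGA

# Fixed order 5: no 5-mer spans the two reads, so the seeds cannot be linked.
fixed = build_kmer_sets(reads, max_k=5, min_k=5)
print(link_seeds("ACGTT", "TGGA", fixed, 5, 5, max_len=20))  # None
```

In the toy example the two reads share too short an overlap for any 5-mer to span them, so a fixed-order graph leaves the seeds unlinked, while the variable-order search lowers the order for one step and then returns to order 5.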
Pages: 4213-4222
Number of pages: 10
Related papers
21 in total
  • [1] Variable-Order de Bruijn Graphs
    Boucher, Christina
    Bowe, Alex
    Gagie, Travis
    Puglisi, Simon J.
    Sadakane, Kunihiko
    2015 DATA COMPRESSION CONFERENCE (DCC), 2015, : 383 - 392
  • [2] Bidirectional Variable-Order de Bruijn Graphs
    Belazzougui, Djamal
    Gagie, Travis
    Makinen, Veli
    Previtali, Marco
    Puglisi, Simon J.
    INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2018, 29 (08) : 1279 - 1295
  • [3] Accurate self-correction of errors in long reads using de Bruijn graphs
    Salmela, Leena
    Walve, Riku
    Rivals, Eric
    Ukkonen, Esko
    BIOINFORMATICS, 2017, 33 (06) : 799 - 806
  • [4] cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs
    Tolstoganov, Ivan
    Bankevich, Anton
    Chen, Zhoutao
    Pevzner, Pavel A.
    BIOINFORMATICS, 2019, 35 (14) : I61 - I70
  • [5] Assembly of long error-prone reads using de Bruijn graphs
    Lin, Yu
    Yuan, Jeffrey
    Kolmogorov, Mikhail
    Shen, Max W.
    Chaisson, Mark
    Pevzner, Pavel A.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (52) : E8396 - E8405
  • [6] Konnector: Connecting Paired-end Reads Using a Bloom Filter de Bruijn Graph
    Vandervalk, Benjamin P.
    Jackman, Shaun D.
    Raymond, Anthony
    Mohamadi, Hamid
    Yang, Chen
    Attali, Dean A.
    Chu, Justin
    Warren, Rene L.
    Birol, Inanc
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2014,
  • [7] De novo diploid genome assembly using long noisy reads
    Nie, Fan
    Ni, Peng
    Huang, Neng
    Zhang, Jun
    Wang, Zhenyu
    Xiao, Chuanle
    Luo, Feng
    Wang, Jianxin
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [9] Efficient Hybrid De Novo Error Correction and Assembly for Long Reads
    Kchouk, Mehdi
    Elloumi, Mourad
    2016 27TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2016, : 88 - 92
  • [10] Efficient Distributed Parallel Aligning Reads and Reference Genome with Many Repetitive Subsequences Using Compact de Bruijn Graph
    Li, Yao
    Zhong, Cheng
    Chen, Danyang
    Zhang, Jinxiong
    Yin, Mengxiao
    PAAP 2021: 2021 12TH INTERNATIONAL SYMPOSIUM ON PARALLEL ARCHITECTURES, ALGORITHMS AND PROGRAMMING, 2021, : 24 - 28