HECIL: A Hybrid Error Correction Algorithm for Long Reads with Iterative Learning

Authors
Olivia Choudhury
Ankush Chakrabarty
Scott J. Emrich
Affiliations
[1] Postdoctoral Researcher, Visiting Research Scientist
[2] IBM Research, Associate Professor, Department of Electrical Engineering and Computer Science
[3] Mitsubishi Electric Research Laboratories
[4] University of Tennessee
Abstract
Second-generation DNA sequencing techniques generate short reads that can result in fragmented genome assemblies. Third-generation sequencing platforms mitigate this limitation by producing longer reads that span complex and repetitive regions. However, the usefulness of such long reads is limited by their high sequencing error rates. To exploit the full potential of these longer reads, it is imperative to correct the underlying errors. We propose HECIL (Hybrid Error Correction with Iterative Learning), a hybrid error correction framework that determines a correction policy for erroneous long reads, based on optimal combinations of decision weights obtained from short read alignments. We demonstrate that HECIL outperforms state-of-the-art error correction algorithms for an overwhelming majority of evaluation metrics on diverse, real-world data sets including E. coli, S. cerevisiae, and the malaria vector mosquito A. funestus. Additionally, we provide an optional way to improve the performance of HECIL's core algorithm: an iterative learning paradigm that enhances the correction policy at each iteration by incorporating knowledge gathered from previous iterations via data-driven confidence metrics assigned to prior corrections.
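The abstract describes two ideas: choosing a correction from a weighted combination of evidence in the short-read alignments, and accepting corrections only when a data-driven confidence is high enough. The Python sketch below is a toy illustration of those two ideas and is not the HECIL implementation. The `Candidate` fields, the weights `w_sim` and `w_qual`, the agreement-based confidence, and the `min_confidence` threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One aligned short read's proposal for a long-read position (hypothetical)."""
    base: str          # base the short read suggests at this position
    similarity: float  # assumed alignment similarity, scaled to [0, 1]
    quality: float     # assumed base quality, scaled to [0, 1]


def score(c, w_sim=0.5, w_qual=0.5):
    # Weighted combination of the two evidence values (weights are assumptions).
    return w_sim * c.similarity + w_qual * c.quality


def correct_position(current, candidates, min_confidence=0.0):
    """Return (base, confidence) for one long-read position."""
    if not candidates:
        return current, 0.0  # no short-read evidence: keep the original base
    best = max(candidates, key=score)
    # Confidence here is the fraction of candidates agreeing with the best one.
    agree = sum(1 for c in candidates if c.base == best.base)
    confidence = agree / len(candidates)
    if confidence < min_confidence:
        return current, confidence  # too uncertain: leave uncorrected for now
    return best.base, confidence


def correct_read(read, pileup, min_confidence=0.0):
    """Apply correct_position at every index; pileup maps index -> candidates."""
    corrected = []
    for i, base in enumerate(read):
        new_base, _ = correct_position(base, pileup.get(i, []), min_confidence)
        corrected.append(new_base)
    return "".join(corrected)


pileup = {
    1: [Candidate("G", 0.9, 0.9), Candidate("G", 0.8, 0.7), Candidate("T", 0.5, 0.5)],
    3: [Candidate("A", 0.9, 0.9), Candidate("T", 0.6, 0.6), Candidate("T", 0.5, 0.5)],
}
single_pass = correct_read("ACGT", pileup)
cautious_pass = correct_read("ACGT", pileup, min_confidence=0.5)
```

With no threshold, both positions take the base from the highest-scoring candidate. With `min_confidence=0.5`, position 3 is left unchanged because only one of three candidates supports the top-scoring base. An iterative scheme in the spirit of the abstract would presumably realign short reads between passes and revisit the low-confidence positions, which this sketch does not attempt.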