HECIL: A Hybrid Error Correction Algorithm for Long Reads with Iterative Learning

Authors
Olivia Choudhury
Ankush Chakrabarty
Scott J. Emrich
Affiliations
[1] Postdoctoral Researcher, Visiting Research Scientist
[2] IBM Research, Associate Professor, Department of Electrical Engineering and Computer Science
[3] Mitsubishi Electric Research Laboratories
[4] University of Tennessee
DOI
Not available
Abstract
Second-generation DNA sequencing techniques generate short reads that can result in fragmented genome assemblies. Third-generation sequencing platforms mitigate this limitation by producing longer reads that span complex and repetitive regions. However, the usefulness of such long reads is limited by their high sequencing error rates. To exploit the full potential of these longer reads, it is imperative to correct the underlying errors. We propose HECIL (Hybrid Error Correction with Iterative Learning), a hybrid error correction framework that determines a correction policy for erroneous long reads based on optimal combinations of decision weights obtained from short read alignments. We demonstrate that HECIL outperforms state-of-the-art error correction algorithms for an overwhelming majority of evaluation metrics on diverse, real-world data sets including E. coli, S. cerevisiae, and the malaria vector mosquito A. funestus. Additionally, we provide an optional avenue for improving the performance of HECIL's core algorithm: an iterative learning paradigm that enhances the correction policy at each iteration by incorporating knowledge gathered from previous iterations via data-driven confidence metrics assigned to prior corrections.
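The abstract gives no implementation details, so the following is only a rough sketch of what a weighted correction policy with confidence-based iteration might look like, not the authors' method. Everything in it is an assumption made for illustration: the names `correct_read` and `iterative_correct`, the two per-alignment features (`quality`, `similarity`), the equal default weights, the confidence threshold, and the restriction to substitution errors with non-negative alignment start positions.

```python
def correct_read(long_read, alignments, w_quality=0.5, w_similarity=0.5,
                 locked=frozenset()):
    """Substitution-only correction of one long read (hypothetical sketch).

    alignments: iterable of (start, short_read, quality, similarity), with
    quality and similarity assumed to be normalized to [0, 1].
    locked: positions that must not be changed (already trusted).
    Returns the corrected read and the winning score per corrected position.
    """
    best = {}  # position -> (score, base) of the highest-scoring short read
    for start, short_read, quality, similarity in alignments:
        # Assumed decision rule: a weighted combination of two features.
        score = w_quality * quality + w_similarity * similarity
        for offset, base in enumerate(short_read):
            pos = start + offset
            if pos >= len(long_read) or pos in locked:
                continue
            if pos not in best or score > best[pos][0]:
                best[pos] = (score, base)

    corrected = list(long_read)
    for pos, (_, base) in best.items():
        corrected[pos] = base
    return "".join(corrected), {pos: s for pos, (s, _) in best.items()}


def iterative_correct(long_read, align_fn, rounds=2, threshold=0.8):
    """Repeat correction, freezing positions corrected with high confidence.

    align_fn: hypothetical callable that re-aligns short reads to the current
    read and returns alignments in the format expected by correct_read.
    """
    locked = set()
    read = long_read
    for _ in range(rounds):
        read, scores = correct_read(read, align_fn(read),
                                    locked=frozenset(locked))
        # Corrections scoring at or above the threshold are kept fixed
        # in all later rounds.
        locked.update(pos for pos, s in scores.items() if s >= threshold)
    return read
```

For example, `correct_read("ACGTTCGA", [(0, "ACGA", 0.9, 0.9), (2, "GTAC", 0.4, 0.5)])` lets the higher-scoring first alignment decide positions 0 to 3, while the second alignment only contributes where the first gives no coverage.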