Correction of sequencing errors in a mixed set of reads

Cited by: 71
Author
Salmela, Leena [1]
Affiliation
[1] Univ Helsinki, Dept Comp Sci, FI-00014 Helsinki, Finland
Funding
Academy of Finland
DOI
10.1093/bioinformatics/btq151
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: High-throughput sequencing technologies produce large sets of short reads that may contain errors. These sequencing errors make de novo assembly challenging. Error correction aims to reduce the error rate prior to assembly. Many de novo sequencing projects use reads from several sequencing technologies to gain the benefits of each technology and to alleviate their shortcomings. However, combining such a mixed set of reads is problematic because many tools are specific to one sequencing platform. The SOLiD sequencing platform is especially problematic in this regard because of the two-base color coding of its reads. Therefore, new tools for working with mixed read sets are needed.
Results: We present an error correction tool for correcting substitutions, insertions and deletions in a mixed set of reads produced by various sequencing platforms. We first develop a method for correcting reads from any sequencing technology that produces base space reads, such as the SOLEXA/Illumina and Roche/454 Life Sciences sequencing platforms. We then further refine the algorithm to correct the color space reads from the Applied Biosystems SOLiD sequencing platform together with normal base space reads. Our new tool is based on the SHREC program, which is aimed at correcting SOLEXA/Illumina reads. Our experiments show that we can detect errors with 99% sensitivity and >98% specificity if the combined sequencing coverage of the sets is at least 12. We also show that the error rate of the reads is greatly reduced.
Availability: The Java source code is freely available at http://www.cs.helsinki.fi/u/lmsalmel/hybrid-shrec/
Contact: leena.salmela@cs.helsinki.fi
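The abstract singles out the two-base color coding of SOLiD reads as a particular obstacle. The sketch below illustrates why, using the commonly described encoding in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). It is an illustration only, not code from the tool described here; the function names `encode_colors` and `decode_colors` are hypothetical.

```python
# Illustrative sketch of two-base (color space) encoding.
# Assumption: the standard mapping A=0, C=1, G=2, T=3, color = XOR of neighbors.
# This is NOT code from the error correction tool described in the abstract.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(bases: str) -> str:
    """Return the first base followed by one color (0-3) per adjacent base pair."""
    if not bases:
        return ""
    colors = [
        str(BASE_TO_BITS[a] ^ BASE_TO_BITS[b])
        for a, b in zip(bases, bases[1:])
    ]
    return bases[0] + "".join(colors)


def decode_colors(read: str) -> str:
    """Decode a color space read (leading base + colors) back to base space."""
    if not read:
        return ""
    bases = [read[0]]
    for color in read[1:]:
        previous = BASE_TO_BITS[bases[-1]]
        # Each base depends on the previous decoded base and the current color.
        bases.append(BITS_TO_BASE[previous ^ int(color)])
    return "".join(bases)


original = "ATGGCA"
colors = encode_colors(original)       # "A31031"
corrupted = "A30031"                   # the second color miscalled (1 -> 0)

print(colors)
print(decode_colors(colors))           # round trip gives "ATGGCA"
print(decode_colors(corrupted))        # "ATTTAC": every base after the error changes
```

Because each decoded base depends on the one before it, a single miscalled color alters all downstream bases when the read is translated naively. This is one reason tools written for base space reads cannot simply be applied to decoded color space reads.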
Pages: 1284-1290
Number of pages: 7