Correction of sequencing errors in a mixed set of reads

Cited by: 71
Author
Salmela, Leena [1]
Affiliation
[1] Univ Helsinki, Dept Comp Sci, FI-00014 Helsinki, Finland
Funding
Academy of Finland
DOI
10.1093/bioinformatics/btq151
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: High-throughput sequencing technologies produce large sets of short reads that may contain errors. These sequencing errors make de novo assembly challenging. Error correction aims to reduce the error rate prior to assembly. Many de novo sequencing projects use reads from several sequencing technologies to combine the benefits of each technology and to alleviate their shortcomings. However, combining such a mixed set of reads is problematic because many tools are specific to one sequencing platform. The SOLiD sequencing platform is especially problematic in this regard because of the two-base color coding of its reads. Therefore, new tools for working with mixed read sets are needed.

Results: We present an error correction tool for correcting substitutions, insertions and deletions in a mixed set of reads produced by various sequencing platforms. We first develop a method for correcting reads from any sequencing technology that produces base space reads, such as the SOLEXA/Illumina and Roche/454 Life Sciences sequencing platforms. We then further refine the algorithm to correct the color space reads from the Applied Biosystems SOLiD sequencing platform together with normal base space reads. Our new tool is based on the SHREC program, which is aimed at correcting SOLEXA/Illumina reads. Our experiments show that we can detect errors with 99% sensitivity and >98% specificity if the combined sequencing coverage of the sets is at least 12. We also show that the error rate of the reads is greatly reduced.

Availability: The JAVA source code is freely available at http://www.cs.helsinki../u/lmsalmel/hybrid-shrec/

Contact: leena.salmela@cs.helsinki.
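To illustrate why the two-base color coding mentioned in the abstract makes mixed read sets awkward, the following is a minimal Python sketch of the standard SOLiD color encoding (each color is determined by a pair of adjacent bases). It is not taken from the paper or from the hybrid-shrec source; the function names `encode_colors` and `decode_colors` are hypothetical, and the sketch assumes reads contain only A, C, G and T.

```python
# Illustrative sketch only; names are hypothetical, not from hybrid-shrec.
# Assumes the usual 2-bit base codes, under which the SOLiD color of two
# adjacent bases equals the XOR of their codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(bases):
    """Return the first base followed by one color (0-3) per adjacent base pair."""
    codes = [BASE_TO_BITS[b] for b in bases]
    colors = [a ^ b for a, b in zip(codes, codes[1:])]
    return bases[0] + "".join(str(c) for c in colors)


def decode_colors(read):
    """Translate a color space read (leading base + colors) back to base space."""
    code = BASE_TO_BITS[read[0]]
    out = [read[0]]
    for color in read[1:]:
        code ^= int(color)  # each color is applied to the previously decoded base
        out.append(BITS_TO_BASE[code])
    return "".join(out)


if __name__ == "__main__":
    print(encode_colors("ATGGC"))  # A3103
    print(decode_colors("A3103"))  # ATGGC
    print(decode_colors("A3203"))  # ATCCG: one wrong color changes every later base
```

Because decoding is cumulative, a single miscalled color corrupts all bases after it. Translating color space reads to base space before correction therefore tends to amplify errors, which is one reason a corrector handling both read types in a single pass is useful.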
Pages: 1284-1290
Number of pages: 7