A survey of error-correction methods for next-generation sequencing

Cited by: 171
Authors
Yang, Xiao [1]
Chockalingam, Sriram P. [2]
Aluru, Srinivas [3]
Affiliations
[1] Broad Inst, Cambridge Ctr 7, Genome Sequencing & Anal Program, Cambridge, MA 02142 USA
[2] Indian Inst Technol, Dept Comp Sci & Engn, Bombay, Maharashtra, India
[3] Iowa State Univ, Ames, IA 50011 USA
Funding
National Science Foundation (USA)
Keywords
error correction; next-generation sequencing; sequence analysis; read alignment
DOI
10.1093/bib/bbs015
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Error correction is important for most next-generation sequencing applications because highly accurate sequenced reads are likely to lead to higher-quality results. Many techniques for error correction of sequencing data from next-generation platforms have been developed in recent years. However, compared with the fast development of sequencing technologies, there is no standardized evaluation procedure for the different error-correction methods, which makes it difficult to assess their relative merits and demerits. In this article, we provide a comprehensive review of many error-correction methods and establish a common set of benchmark data and evaluation criteria to provide a comparative assessment. We present experimental results on the quality, run-time, memory usage and scalability of several error-correction methods. Apart from providing explicit recommendations useful to practitioners, the review serves to identify the current state of the art and promising directions for future research. Availability: All error-correction programs used in this article were downloaded from their hosting websites. The evaluation tool kit is publicly available at: http://aluru-sun.ece.iastate.edu/doku.php?id=ecr.
Pages: 56-66
Number of pages: 11
Related papers
50 in total
  • [1] Benchmarking of computational error-correction methods for next-generation sequencing data
    Keith Mitchell
    Jaqueline J. Brito
    Igor Mandric
    Qiaozhen Wu
    Sergey Knyazev
    Sei Chang
    Lana S. Martin
    Aaron Karlsberg
    Ekaterina Gerasimov
    Russell Littman
    Brian L. Hill
    Nicholas C. Wu
    Harry Taegyun Yang
    Kevin Hsieh
    Linus Chen
    Eli Littman
    Taylor Shabani
    German Enik
    Douglas Yao
    Ren Sun
    Jan Schroeder
    Eleazar Eskin
    Alex Zelikovsky
    Pavel Skums
    Mihai Pop
    Serghei Mangul
    [J]. Genome Biology, 21
  • [2] Benchmarking of computational error-correction methods for next-generation sequencing data
    Mitchell, Keith
    Brito, Jaqueline J.
    Mandric, Igor
    Wu, Qiaozhen
    Knyazev, Sergey
    Chang, Sei
    Martin, Lana S.
    Karlsberg, Aaron
    Gerasimov, Ekaterina
    Littman, Russell
    Hill, Brian L.
    Wu, Nicholas C.
    Yang, Harry
    Hsieh, Kevin
    Chen, Linus
    Littman, Eli
    Shabani, Taylor
    Enik, German
    Yao, Douglas
    Sun, Ren
    Schroeder, Jan
    Eskin, Eleazar
    Zelikovsky, Alex
    Skums, Pavel
    Pop, Mihai
    Mangul, Serghei
    [J]. ACM-BCB 2020 - 11TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2020
  • [3] Benchmarking of computational error-correction methods for next-generation sequencing data
    Mitchell, Keith
    Brito, Jaqueline J.
    Mandric, Igor
    Wu, Qiaozhen
    Knyazev, Sergey
    Chang, Sei
    Martin, Lana S.
    Karlsberg, Aaron
    Gerasimov, Ekaterina
    Littman, Russell
    Hill, Brian L.
    Wu, Nicholas C.
    Yang, Harry Taegyun
    Hsieh, Kevin
    Chen, Linus
    Littman, Eli
    Shabani, Taylor
    Enik, German
    Yao, Douglas
    Sun, Ren
    Schroeder, Jan
    Eskin, Eleazar
    Zelikovsky, Alex
    Skums, Pavel
    Pop, Mihai
    Mangul, Serghei
    [J]. GENOME BIOLOGY, 2020, 21 (01)
  • [4] Effects of error-correction of heterozygous next-generation sequencing data
    Fujimoto, M. Stanley
    Bodily, Paul M.
    Okuda, Nozomu
    Clement, Mark J.
    Snell, Quinn
    [J]. BMC BIOINFORMATICS, 2014, 15
  • [5] Effects of error-correction of heterozygous next-generation sequencing data
    M Stanley Fujimoto
    Paul M Bodily
    Nozomu Okuda
    Mark J Clement
    Quinn Snell
    [J]. BMC Bioinformatics, 15
  • [6] MapReduce for accurate error correction of next-generation sequencing data
    Zhao, Liang
    Chen, Qingfeng
    Li, Wencui
    Jiang, Peng
    Wong, Limsoon
    Li, Jinyan
    [J]. BIOINFORMATICS, 2017, 33 (23) : 3844 - 3851
  • [7] Efficient error correction for next-generation sequencing of viral amplicons
    Skums, Pavel
    Dimitrova, Zoya
    Campo, David S.
    Vaughan, Gilberto
    Rossi, Livia
    Forbi, Joseph C.
    Yokosawa, Jonny
    Zelikovsky, Alex
    Khudyakov, Yury
    [J]. BMC BIOINFORMATICS, 2012, 13
  • [8] Efficient error correction for next-generation sequencing of viral amplicons
    Pavel Skums
    Zoya Dimitrova
    David S Campo
    Gilberto Vaughan
    Livia Rossi
    Joseph C Forbi
    Jonny Yokosawa
    Alex Zelikovsky
    Yury Khudyakov
    [J]. BMC Bioinformatics, 13
  • [9] A systematic comparison of error correction enzymes by next-generation sequencing
    Lubock, Nathan B.
    Zhang, Di
    Sidore, Angus M.
    Church, George M.
    Kosuri, Sriram
    [J]. NUCLEIC ACIDS RESEARCH, 2017, 45 (15) : 9206 - 9217
  • [10] Error filtering, pair assembly and error correction for next-generation sequencing reads
    Edgar, Robert C.
    Flyvbjerg, Henrik
    [J]. BIOINFORMATICS, 2015, 31 (21) : 3476 - 3482