Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

Cited by: 51
Authors
Kosugi, Shunichi [1, 2]
Natsume, Satoshi [1]
Yoshida, Kentaro [1]
MacLean, Daniel [3]
Cano, Liliana [3]
Kamoun, Sophien [3]
Terauchi, Ryohei [1]
Affiliations
[1] Iwate Biotechnol Res Ctr, Kitakami, Iwate, Japan
[2] Kazusa DNA Res Inst, Chiba, Japan
[3] Sainsbury Lab, Norwich, Norfolk, England
Source
PLOS ONE | 2013 / Volume 8 / Issue 10
Keywords
END SHORT READS; SNP DISCOVERY; MOLECULAR-SPECTRUM; HUMAN GENOME; DNA; GENOTYPE; BREAKPOINTS; MUTATIONS; DELETIONS; COVERAGE;
DOI
10.1371/journal.pone.0075402
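Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09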
Abstract
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in 'targeted' alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification.
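The abstract describes two steps: correcting mismatched bases using base quality and allele frequency, then filtering reads that still carry mismatches. The toy sketch below illustrates that general idea only. It is not Coval's implementation: the `Read` class, the `correct_and_filter` function, and all thresholds (`min_qual`, `min_freq`, `max_mismatches`) are hypothetical, and it assumes ungapped alignments with no local realignment step.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Read:
    start: int   # 0-based reference position of the first base (hypothetical model)
    seq: str     # read bases; ungapped alignment is assumed
    quals: list  # per-base Phred quality scores


def allele_counts(reads):
    """Count the bases observed at every covered reference position."""
    counts = defaultdict(Counter)
    for r in reads:
        for i, base in enumerate(r.seq):
            counts[r.start + i][base] += 1
    return counts


def correct_and_filter(reference, reads, min_qual=20, min_freq=0.1, max_mismatches=1):
    """Correct weakly supported mismatches, then drop reads with too many left.

    All three thresholds are illustrative assumptions, not values from the paper.
    """
    counts = allele_counts(reads)
    kept = []
    for r in reads:
        bases = list(r.seq)
        mismatches = 0
        for i, base in enumerate(bases):
            pos = r.start + i
            ref_base = reference[pos]
            if base == ref_base:
                continue
            freq = counts[pos][base] / sum(counts[pos].values())
            if r.quals[i] < min_qual or freq < min_freq:
                # Low quality or rare allele: treat as a sequencing error.
                bases[i] = ref_base
            else:
                mismatches += 1
        if mismatches <= max_mismatches:
            kept.append(Read(r.start, "".join(bases), r.quals))
    return kept


reference = "ACGTACGTAC"
reads = [
    Read(0, "ACGTACGT", [30] * 8),               # perfect match
    Read(0, "ACGAACGT", [30] * 8),               # well-supported variant at position 3
    Read(0, "ACGAACGT", [30] * 8),               # same variant, second supporting read
    Read(0, "ACGTACCT", [30] * 6 + [5] + [30]),  # low-quality mismatch at position 6
    Read(0, "TTGTACGT", [30] * 8),               # two high-quality mismatches
]
kept = correct_and_filter(reference, reads)
print([r.seq for r in kept])
```

In this example the low-quality mismatch is corrected back to the reference base, the two reads supporting the variant at position 3 are retained, and the read with two remaining mismatches is discarded.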
Pages: 11