Empirical Bayes single nucleotide variant-calling for next-generation sequencing data

Authors
Ali Karimnezhad
Theodore J. Perkins
Affiliations
[1] Department of Mathematics and Statistics, University of Ottawa
[2] Biostatistics and Risk Modelling Division, Bureau of Food Surveillance and Science Integration, Food Directorate, Health Products and Food Branch, Health Canada
[3] Regenerative Medicine Program, Ottawa Hospital Research Institute
[4] Department of Biochemistry, Microbiology and Immunology, University of Ottawa
Abstract
One of the fundamental computational problems in cancer genomics is the identification of single nucleotide variants (SNVs) from DNA sequencing data. Many statistical models and software implementations for SNV calling have been developed, yet they still disagree widely on real datasets. Based on an empirical Bayesian approach, we introduce a local false discovery rate (LFDR) estimator for germline SNV calling. Our approach learns model parameters without prior information, and simultaneously accounts for information across all sites in the genomic regions of interest. We also propose another LFDR-based algorithm that reliably prioritizes a given list of mutations called by any other variant-calling algorithm. We use a suite of gold-standard cell line data to compare our LFDR approach against a collection of widely used, state-of-the-art programs. We find that our LFDR approach approximately matches or exceeds the performance of all of these programs, despite some very large differences among them. Furthermore, when prioritizing other algorithms' calls by our LFDR score, we find that by adjusting the type I/type II error tradeoff we can select subsets of variant calls with minimal loss of sensitivity but dramatic increases in precision.
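To make the LFDR idea concrete, here is a minimal sketch of a generic two-group empirical Bayes calculation. It is not the authors' model. It assumes, purely for illustration, that non-variant sites produce alternate reads at a fixed sequencing error rate and that variant sites are heterozygous with an alternate-allele fraction of 0.5. Only the prior proportion of non-variant sites (`pi0`) is learned from the data, using all sites at once. The function names, parameter values, and read counts are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k alternate reads out of n total reads."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def lfdr_two_group(sites, error_rate=0.01, variant_frac=0.5,
                   n_iter=200, tol=1e-10):
    """Estimate pi0 and per-site LFDR for (alt_reads, depth) pairs.

    Assumed model (illustrative only):
      null group:    alt_reads ~ Binomial(depth, error_rate)
      variant group: alt_reads ~ Binomial(depth, variant_frac)
    pi0, the prior probability that a site is non-variant, is estimated
    by EM from all sites jointly, which is the empirical Bayes step.
    """
    f0 = [binom_pmf(k, n, error_rate) for k, n in sites]
    f1 = [binom_pmf(k, n, variant_frac) for k, n in sites]

    def posterior_null(pi0):
        # LFDR = posterior probability that the site is NOT a variant.
        return [pi0 * a / (pi0 * a + (1 - pi0) * b) for a, b in zip(f0, f1)]

    pi0 = 0.5  # arbitrary starting value
    for _ in range(n_iter):
        new_pi0 = sum(posterior_null(pi0)) / len(sites)
        converged = abs(new_pi0 - pi0) < tol
        pi0 = new_pi0
        if converged:
            break
    return pi0, posterior_null(pi0)


def prioritize(lfdr, threshold=0.05):
    """Return site indices with LFDR <= threshold, most confident first."""
    kept = [i for i, v in enumerate(lfdr) if v <= threshold]
    return sorted(kept, key=lambda i: lfdr[i])


# Hypothetical (alt_reads, depth) counts for six sites.
sites = [(0, 30), (1, 40), (15, 30), (0, 25), (20, 42), (2, 50)]
pi0, lfdr = lfdr_two_group(sites)
called = prioritize(lfdr, threshold=0.05)
```

Lowering `threshold` keeps only calls with a smaller estimated probability of being false, trading sensitivity for precision. The same ranking step could in principle be applied to a list of calls produced by another variant caller, provided an LFDR-like score is available for each call.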
Source
Karimnezhad, Ali; Perkins, Theodore J. Empirical Bayes single nucleotide variant-calling for next-generation sequencing data. Scientific Reports, 2024, 14(1).