naiveBayesCall: An Efficient Model-Based Base-Calling Algorithm for High-Throughput Sequencing

Authors
Kao, Wei-Chun [1]
Song, Yun S. [1]
Affiliation
[1] Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720 USA
Keywords
GENOME; MATRIX;
DOI
Not available
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Immense amounts of raw instrument data (i.e., images of fluorescence) are currently being generated using ultra-high-throughput sequencing platforms. An important computational challenge associated with this rapid advancement is to develop efficient algorithms that can extract accurate sequence information from raw data. To address this challenge, we recently introduced a novel model-based base-calling algorithm that is fully parametric and has several advantages over previously proposed methods. Our original algorithm, called BayesCall, significantly reduced the error rate, particularly in the later cycles of a sequencing run, and also produced useful base-specific quality scores with a high discrimination ability. However, BayesCall is too computationally expensive to be of broad practical use. In this paper, we build on our previous model-based approach to devise an efficient base-calling algorithm that is orders of magnitude faster than BayesCall, while still maintaining a comparably high level of accuracy. Our new algorithm is called naiveBayesCall, and it utilizes approximation and optimization methods to achieve scalability. We describe the performance of naiveBayesCall and demonstrate how improved base-calling accuracy may facilitate de novo assembly when the coverage is low to moderate.
Pages: 233-247
Number of pages: 15