Bayesian basecalling for DNA sequence analysis using hidden Markov models

Authors
Liang, Kuo-ching [1]
Wang, Xiaodong [1]
Anastassiou, Dimitris [1]
Affiliations
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
Funding
US National Science Foundation
Keywords
DOI
10.1109/CISS.2006.286391
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
It has been shown that electropherograms of DNA sequences can be modelled with hidden Markov models (HMMs). Basecalling, the procedure that determines the sequence of bases from a given electropherogram, can then be performed using the Viterbi algorithm. A training step is required prior to basecalling in order to estimate the HMM parameters. In this paper, we propose a Bayesian approach that employs the Markov chain Monte Carlo (MCMC) method to perform basecalling. Such an approach not only allows one to naturally encode prior biological knowledge into the basecalling algorithm, but also exploits both the training data and the basecalling data in estimating the HMM parameters, leading to more accurate estimates. Using the recently sequenced genome of the organism Legionella pneumophila, we show that performance comparable to that of the state-of-the-art basecalling algorithm, in terms of total errors, can be achieved even when a simple Gaussian model is assumed for the emission densities.
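To make the Viterbi step mentioned in the abstract concrete, the following is a minimal sketch of Viterbi decoding for an HMM with Gaussian emission densities. It is not the authors' model or their MCMC procedure: the four-state layout (one hidden state per base), the scalar observations, and all parameter values (`means`, `stds`, uniform transitions) are assumptions chosen purely for illustration.

```python
import math

BASES = "ACGT"  # assumption: one hidden state per base


def log_gauss(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def viterbi(obs, log_init, log_trans, means, stds):
    """Return the most probable hidden-state path (list of state indices) for obs."""
    n_states = len(log_init)
    # score[s]: best log-probability of any path that ends in state s at the current step
    score = [log_init[s] + log_gauss(obs[0], means[s], stds[s]) for s in range(n_states)]
    backptr = []  # backptr[t][s]: best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev = score
        step_ptr, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            step_ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_gauss(x, means[s], stds[s]))
        backptr.append(step_ptr)
    # Trace back from the best final state to recover the full path
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy parameters (hypothetical, not estimated from any electropherogram)
log_init = [math.log(0.25)] * 4
log_trans = [[math.log(0.25)] * 4 for _ in range(4)]
means = [0.0, 1.0, 2.0, 3.0]
stds = [0.2] * 4

obs = [0.1, 2.9, 1.05, 2.1, 0.0]
called = "".join(BASES[s] for s in viterbi(obs, log_init, log_trans, means, stds))
print(called)  # prints ATCGA
```

In this sketch the parameters are fixed by hand. In the setting the abstract describes, they would come from a training step or, in the Bayesian approach, be estimated jointly from the training and basecalling data.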
Pages: 1599-1604
Number of pages: 6
Related papers
50 in total
  • [1] Bayesian basecalling for DNA sequence analysis using hidden Markov models
    Liang, Kuo-ching
    Wang, Xiaodong
    Anastassiou, Dimitris
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2007, 4 (03) : 430 - 440
  • [2] Basecalling using hidden Markov models
    Boufounos, P
    El-Difrawy, S
    Ehrlich, D
    [J]. JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2004, 341 (1-2) : 23 - 36
  • [3] Bayesian hidden Markov model for DNA sequence segmentation: A prior sensitivity analysis
    Nur, Darfiana
    Allingham, David
    Rousseau, Judith
    Mengersen, Kerrie L.
    McVinish, Ross
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (05) : 1873 - 1882
  • [4] VARIATIONAL BAYESIAN ANALYSIS FOR HIDDEN MARKOV MODELS
    McGrory, C. A.
    Titterington, D. M.
    [J]. AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2009, 51 (02) : 227 - 244
  • [5] Computational Bayesian analysis of hidden Markov models
    Ryden, T
    Titterington, DM
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 1998, 7 (02) : 194 - 211
  • [6] Hidden Markov models in biological sequence analysis
    Birney, E
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2001, 45 (3-4) : 449 - 454
  • [7] Hidden Markov models and genome sequence analysis
    Eddy, SR
    [J]. FASEB JOURNAL, 1998, 12 (08) : A1327 - A1327
  • [8] State Sequence Analysis in Hidden Markov Models
    Grinberg, Yuri
    Perkins, Theodore J.
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2015 : 336 - 344
  • [9] Computational Bayesian analysis of hidden Markov mesh models
    Dunmur, AP
    Titterington, DM
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, 19 (11) : 1296 - 1300
  • [10] Bayesian hidden Markov models in DNA sequence segmentation using R: the case of Simian Vacuolating virus (SV40)
    Totterdell, James A.
    Nur, Darfiana
    Mengersen, Kerrie L.
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2017, 87 (14) : 2799 - 2827