Optimisation of HMM Topologies Enhances DNA and Protein Sequence Modelling

Cited by: 1
Authors
Friedrich, Torben [1]
Koetschan, Christian [1]
Mueller, Tobias [1]
Affiliation
[1] Univ Wurzburg, D-97070 Wurzburg, Germany
Keywords
artificial intelligence; biostatistics; hidden Markov models; mathematical modelling; pattern recognition; statistics; secondary structure; prediction; ITS2; database
DOI
10.2202/1544-6115.1480
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Hidden Markov models (HMMs) play a major role in applications that aim to unravel biomolecular functionality. Although HMMs are technically mature and widely applied in computational biology, there is potential for methodological optimisation in how they model biological data sources with varying sequence lengths. The single building blocks of these models, the states, are each associated with a certain holding time, which links them to the length distribution of the sequence motifs they represent. Regular HMM topologies are adapted to bell-shaped sequence length distributions by serially chaining hidden states, which keeps the model within the class of conventional hidden Markov models. The number of state repetitions (r) and the parameter for the state-specific duration of stay (p) are determined by fitting the distribution of sequence lengths with the method of moments (MM) and with maximum likelihood (ML). Performance evaluations of differently adjusted HMM topologies underline the impact of optimising HMMs based on sequence lengths. Secondary structure prediction on internal transcribed spacer 2 (ITS2) sequences serves as an example of the general impact of topological optimisation. In summary, we propose a general methodology to improve the modelling behaviour of HMMs by topological optimisation, using ML and a fast, easily implementable moment estimator.
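As a rough illustration of the moment-based fit mentioned in the abstract, the following Python sketch estimates r and p from a sample of observed sequence lengths. It rests on assumptions the abstract does not state: every one of the r chained states shares the same self-transition probability p, so the total holding time is a sum of r geometric durations with mean r/(1-p) and variance rp/(1-p)^2. The function name `fit_chain_by_moments` and the rounding of r to an integer are illustrative choices, not necessarily the authors' estimator.

```python
from statistics import mean, pvariance


def fit_chain_by_moments(lengths):
    """Moment estimate of (r, p) for a serial chain of r identical states.

    Assumption (not taken from the paper): each state is left with
    probability 1 - p per step, so the chain's total duration has
    mean m = r / (1 - p) and variance v = r * p / (1 - p) ** 2.
    """
    m = float(mean(lengths))
    v = float(pvariance(lengths))

    # Solving the two moment equations gives r = m^2 / (m + v).
    r_real = m * m / (m + v)

    # A topology needs a whole number of states, at least one.
    r = max(1, round(r_real))

    # Re-derive p so that the integer chain reproduces the sample mean;
    # clamp at 0 in case rounding pushed r above the mean length.
    p = max(0.0, 1.0 - r / m)
    return r, p


if __name__ == "__main__":
    r, p = fit_chain_by_moments([8, 10, 12, 10])
    print(r, p)
```

A small variance relative to the mean yields a long chain with little self-looping (a narrow, bell-shaped length distribution), whereas a large variance pushes r towards 1, which is the single-state geometric case.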
Pages: 27