Clustering ionic flow blockade toggles with a Mixture of HMMs

Cited: 10
Authors
Churbanov, Alexander [1 ]
Winters-Hilt, Stephen [1 ,2 ]
Affiliations
[1] Res Inst Children, New Orleans, LA 70118 USA
[2] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
DOI
10.1186/1471-2105-9-S9-S13
Chinese Library Classification
Q5 [Biochemistry];
Subject classification
071010; 081704
Abstract
Background: Ionic current blockade signal processing, for use in nanopore detection, offers a promising new way to analyze single-molecule properties, with potential implications for DNA sequencing. The alpha-Hemolysin transmembrane channel interacts with a translocating molecule in a nontrivial way, frequently evidenced by a complex ionic flow blockade pattern with readily distinguishable modes of toggling. Effective processing of such signals requires machine learning methods capable of learning the various blockade modes for classification and knowledge discovery purposes. Here we propose a method aimed at improving our stochastic analysis capabilities, to better understand the discriminatory capabilities of the observed nanopore channel interactions with the analyte.

Results: We tailored our memory-sparse distributed implementation of a Mixture of Hidden Markov Models (MHMM) to the problem of channel current blockade clustering and associated analyte classification. By using probabilistic fully connected HMM profiles as mixture components, we were able to cluster the various 9 base-pair hairpin channel blockades. We obtained very high Maximum a Posteriori (MAP) classification accuracy with a mixture of 12 different channel blockade profiles, each with 4 levels, a configuration that can be computed with sufficient speed for real-time experimental feedback. MAP classification performance depends on several factors, such as the number of mixture components, the number of levels in each profile, and the duration of a channel blockade event. We distribute the Baum-Welch Expectation Maximization (EM) algorithm running on our model in two ways. First, a distributed implementation of the MHMM data processing accelerates data clustering efforts. The second, simultaneous, strategy uses an EM checkpointing algorithm to lower memory use and efficiently distribute the bulk of EM processing when handling large data sequences (such as for the progressive sums used in the HMM parameter estimates).

Conclusion: The proposed distributed MHMM method has many appealing properties, such as precise classification of the analyte in real-time scenarios and the ability to incorporate new domain knowledge into a flexible, easily distributable architecture. The distributed HMM provides feature extraction equivalent to that of the sequential HMM, with a speedup factor approximately equal to the number of independent CPUs operating on the data. The MHMM topology learns the clusters existing within data samples via distributed HMM EM learning. A Java implementation of the MHMM algorithm is available at http://logos.cs.uno.edu/~achurban.
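The MAP decision rule described in the abstract scores an observed blockade sequence against each mixture component via the forward algorithm and picks the component with the highest posterior (log mixture weight plus log-likelihood). The paper's implementation is in Java; what follows is only a minimal, illustrative Python sketch with hypothetical toy parameters (two 2-state discrete HMMs over a binary alphabet), not the paper's 12-profile, 4-level configuration.

```python
import math

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under one HMM
    (pi: initial, A: transition, B: emission probabilities), using
    per-step scaling to avoid numerical underflow on long sequences."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    s = sum(alpha)
    loglik = math.log(s)
    alpha = [a / s for a in alpha]
    for t in range(1, len(obs)):
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
        s = sum(alpha)
        loglik += math.log(s)
        alpha = [a / s for a in alpha]
    return loglik

def map_classify(obs, mixture):
    """MAP classification: mixture is a list of (weight, pi, A, B) tuples,
    one per component HMM; returns the index of the most probable component."""
    scores = [math.log(w) + forward_loglik(obs, pi, A, B)
              for (w, pi, A, B) in mixture]
    return max(range(len(scores)), key=lambda k: scores[k])

# Toy two-component mixture (illustrative values only): component 0
# favors emitting symbol 0, component 1 favors symbol 1.
uniform_A = [[0.5, 0.5], [0.5, 0.5]]
mixture = [
    (0.5, [0.5, 0.5], uniform_A, [[0.9, 0.1], [0.9, 0.1]]),
    (0.5, [0.5, 0.5], uniform_A, [[0.1, 0.9], [0.1, 0.9]]),
]
```

In the paper's setting, each component would be a fully connected multi-level blockade profile learned by distributed Baum-Welch EM, and the mixture weights would be the learned cluster priors; the per-component forward passes are independent, which is what makes the classification step straightforward to distribute.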
Pages: 12
Related papers (50 items)
  • [1] Clustering ionic flow blockade toggles with a Mixture of HMMs
    Alexander Churbanov
    Stephen Winters-Hilt
    [J]. BMC Bioinformatics, 2008, 9 (Suppl 9)
  • [2] Tree-based clustering for Gaussian mixture HMMs
    Kato, Tsuneo
    Kuroiwa, Shingo
    Shimizu, Tohru
    Higuchi, Norio
    [J]. Systems and Computers in Japan, 2002, 33 (04) : 40 - 49
  • [3] A Generative Time Series Clustering Framework Based on an Ensemble Mixture of HMMs
    Kanaan, Mohamad
    Benabdeslem, Khalid
    Kheddouci, Hamamache
    [J]. 2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 793 - 798
  • [4] Noisy Speech Recognition by using Output Combination of Discrete-Mixture HMMs and Continuous-Mixture HMMs
    Kosaka, Tetsuo
    Saito, You
    Kato, Masaharu
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2355 - 2358
  • [5] Boosted Mixture Learning of Gaussian Mixture HMMs for Speech Recognition
    Du, Jun
    Hu, Yu
    Jiang, Hui
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2942 - +
  • [6] Training mixture density HMMs with SOM and LVQ
    Kurimo, M
    [J]. COMPUTER SPEECH AND LANGUAGE, 1997, 11 (04): : 321 - 343
  • [7] Generalized mixture of HMMs for continuous speech recognition
    Korkmazskiy, F
    Juang, BH
    Soong, F
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1443 - 1446
  • [8] Ionic Liquid Mixture Design for Carbon Capture using Property Clustering Technique
    Chong, Fah K.
    Chemmangattuvalappil, Nishanth G.
    Foo, Dominic C. Y.
    Atilhan, Mert
    Eljack, Fadwa T.
    [J]. PRES15: PROCESS INTEGRATION, MODELLING AND OPTIMISATION FOR ENERGY SAVING AND POLLUTION REDUCTION, 2015, 45 : 1567 - 1572
  • [9] Lecture Speech Recognition Using Discrete-Mixture HMMs
    Kosaka, Tetsuo
    Yamamoto, Akiyoshi
    Kumakura, Takuya
    Kato, Masaharu
    Kohda, Masaki
    [J]. IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2011, 6 (01) : 23 - 29
  • [10] Robust speech recognition using discrete-mixture HMMs
    Kosaka, T
    Katoh, M
    Kohda, M
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (12): : 2811 - 2818