Acoustic Factor Analysis for Streamed Hidden Markov Modeling

Cited by: 1
Authors
Chien, Jen-Tzung [1]
Ting, Chuan-Wei [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
Factor analysis (FA); Markov chain; streamed hidden Markov model; speech recognition; MAXIMUM-LIKELIHOOD; SPEECH; COMBINATION
DOI
10.1109/TASL.2009.2014891
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a novel streamed hidden Markov model (HMM) framework for speech recognition. The factor analysis (FA) principle is adopted to explore the common factors from acoustic features. The streaming regularities in building HMMs are governed by the correlation between cepstral features, which is inherent in common factors. Those features corresponding to the same factor are generated by the identical HMM state. Accordingly, multiple Markov chains are adopted to characterize the variation trends in different dimensions of cepstral vectors. An FA streamed HMM (FASHMM) method is developed to relax the assumption of standard HMM topology, namely, that all features of a speech frame perform the same state emission. The proposed FASHMM is more flexible than the streamed factorial HMM (SFHMM), where the streaming was empirically determined. To reduce the number of factor loading matrices in FA, we evaluated the similarity between individual matrices to find the optimal solution to parameter clustering of FA models. A new decoding algorithm was presented to perform FASHMM speech recognition. FASHMM carries out the streamed Markov chains for a sequence of multivariate Gaussian mixture observations through the state transitions of the partitioned vectors. In the experiments, the proposed method reduced the recognition error rates significantly when compared with the standard HMM and SFHMM methods.
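The idea that features sharing a common factor form one stream can be illustrated with a minimal sketch. The abstract does not state the exact grouping rule, so the rule used here (assign each feature dimension to the factor with the largest absolute loading) is an assumption, and the function name `assign_streams` and the toy loading values are hypothetical.

```python
from collections import defaultdict

def assign_streams(loadings):
    """Group feature dimensions by their dominant factor.

    loadings[d][k] is the loading of feature dimension d on factor k.
    The dominant-loading rule is an illustrative assumption, not the
    paper's stated procedure.
    Returns a dict mapping factor index -> list of feature indices.
    """
    streams = defaultdict(list)
    for d, row in enumerate(loadings):
        # Pick the factor with the largest absolute loading for this feature.
        k = max(range(len(row)), key=lambda j: abs(row[j]))
        streams[k].append(d)
    return dict(streams)

# Toy loading matrix: 4 feature dimensions, 2 factors (made-up values).
toy_loadings = [
    [0.9, 0.1],
    [0.8, -0.2],
    [0.1, 0.7],
    [-0.3, -0.6],
]
streams = assign_streams(toy_loadings)
print(streams)
```

Under this assumed rule, each resulting group of feature indices would be modeled by its own Markov chain, so different parts of a cepstral vector can occupy different states in the same frame.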
Pages: 1279-1291
Number of pages: 13