Convolutional density estimation in hidden Markov models for speech recognition

Times cited: 3
Authors
Matsoukas, S [1]
Zavaliagkos, G [1]
Affiliations
[1] BBN Syst & Technol Corp, GTE Internetworking, Cambridge, MA 02138 USA
DOI
10.1109/ICASSP.1999.758075
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In continuous-density hidden Markov models (HMMs) for speech recognition, the probability density function (pdf) of each state is usually expressed as a mixture of Gaussians. In this paper, we present a model in which the pdf is expressed as the convolution of two densities. We focus on the special case where one of the convolved densities is an M-Gaussian mixture and the other is a mixture of N impulses. We present the reestimation formulae for the parameters of the M × N convolutional model and suggest two ways of initializing them: the residual K-Means approach, and deconvolution from a standard HMM with MN Gaussians per state, using a genetic algorithm to search for the optimal assignment of Gaussians. Both methods result in a compact representation that requires only O(M + N) storage space for the model parameters and O(MN) time for training and decoding. We explain how the decoding time can be reduced to O(M + kN), where k < M. Finally, results are shown on the 1996 Hub-4 development test, demonstrating that a 32 × 2 convolutional model can achieve performance comparable to that of a standard 64-Gaussian-per-state model.
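As an illustration of the special case described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional features and scalar variances, and all function and parameter names are hypothetical. It relies only on the standard identity that convolving a density f with a weighted sum of impulses at shifts d_n yields the weighted sum of shifted copies f(x - d_n), so an M-Gaussian mixture convolved with N impulses is equivalent to an MN-Gaussian mixture whose means are mu_m + d_n and whose variances are shared across shifts.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x, weights, means, variances):
    """Density of an M-component Gaussian mixture."""
    return sum(c * gauss_pdf(x, mu, var)
               for c, mu, var in zip(weights, means, variances))


def convolutional_density(x, weights, means, variances, imp_weights, imp_shifts):
    """Gaussian mixture convolved with a mixture of N impulses.

    Convolving f with sum_n w_n * delta(x - d_n) gives sum_n w_n * f(x - d_n).
    Cost is O(MN) Gaussian evaluations, with only O(M + N) stored parameters.
    """
    return sum(w * gmm_pdf(x - d, weights, means, variances)
               for w, d in zip(imp_weights, imp_shifts))


def expand_to_mixture(weights, means, variances, imp_weights, imp_shifts):
    """Equivalent explicit mixture: MN components as (weight, mean, variance)."""
    return [(c * w, mu + d, var)
            for c, mu, var in zip(weights, means, variances)
            for w, d in zip(imp_weights, imp_shifts)]
```

In this one-dimensional sketch a convolutional model stores 3M + 2N numbers, whereas the expanded mixture stores 3MN; for M = 32 and N = 2 that is 100 versus 192. The parameterization actually used in the paper may differ, since the abstract does not spell it out.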
Pages: 113-116
Page count: 4
Related papers
50 in total
  • [41] Visual speech recognition using Active Shape Models and Hidden Markov Models
    Luettin, J
    Thacker, NA
    Beet, SW
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 817 - 820
  • [42] Robust speech recognition using maximum likelihood neural networks and continuous density Hidden Markov Models
    Yuk, DS
    Che, CW
    Flanagan, J
    [J]. 1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 474 - 481
  • [43] Telephone speech recognition using neural networks and hidden Markov models
    Yuk, D
    Flanagan, J
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 157 - 160
  • [44] Integration of Auxiliary Features in Hidden Markov Models for Arabic Speech Recognition
    Amrous, Anissa Imen
    Debyeche, Mohamed
    Amrouche, A.
    [J]. 2009 3RD INTERNATIONAL CONFERENCE ON SIGNALS, CIRCUITS AND SYSTEMS (SCS 2009), 2009, : 612 - 616
  • [45] STRANDED GAUSSIAN MIXTURE HIDDEN MARKOV MODELS FOR ROBUST SPEECH RECOGNITION
    Zhao, Yong
    Juang, Biing-Hwang
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4301 - 4304
  • [46] Fuzzy Hidden Markov Models for Speech Recognition on based FEM Algorithm
    Taheri, Asghar
    Tarihi, Mohammad Reza
    Baghgar, Hassan
    Abad, Bostan
    Bababeyk, Hassan
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 4, 2005, 4 : 59 - 61
  • [47] Visual speech recognition using motion features and hidden Markov models
    Yau, Wai Chee
    Kumar, Dinesh Kant
    Weghorn, Hans
    [J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS, 2007, 4673 : 832 - 839
  • [48] Stereophonic speech recognition in noise using compensated hidden Markov models
    Brookes, DM
    Leung, MH
    [J]. ELECTRONICS LETTERS, 1998, 34 (19) : 1827 - 1829
  • [49] Combination of hidden Markov models with dynamic time warping for speech recognition
    Axelrod, S
    Maison, B
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 173 - 176
  • [50] Combination of vector quantization and Hidden Markov Models for Arabic speech recognition
    Bahi, H
    Sellami, M
    [J]. ACS/IEEE INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2001, : 96 - 100