Generative factor analyzed HMM for automatic speech recognition

Cited by: 4
Authors
Yao, KS
Paliwal, KK
Lee, TW
Affiliations
[1] Univ Calif San Diego, Inst Neural Computat, La Jolla, CA 92093 USA
[2] Griffith Univ, Sch Microelect Engn, Brisbane, Qld 4111, Australia
Keywords
hidden Markov models; factor analysis; mixture of Gaussians; speech recognition; expectation maximization algorithm
DOI
10.1016/j.specom.2005.01.002
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a generative factor analyzed hidden Markov model (GFA-HMM) for automatic speech recognition. In a standard HMM, observation vectors are represented by mixtures of Gaussians (MoG) that depend on a discrete-valued hidden state sequence. The GFA-HMM introduces a hierarchy of continuous-valued latent representations of the observation vectors, in which latent vectors at one level are acoustic-unit dependent and latent vectors at a higher level are acoustic-unit independent. An expectation maximization (EM) algorithm is derived for maximum likelihood estimation of the model. A set of experiments verifies the potential of the GFA-HMM as an alternative acoustic modeling technique. In one experiment, by varying the latent dimension and the number of mixture components in the latent spaces, the GFA-HMM attained a more compact representation than the standard HMM. In other experiments with various noise types and speaking styles, the GFA-HMM achieved statistically significant improvements over the standard HMM. (c) 2005 Elsevier B.V. All rights reserved.
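As a rough illustration of why a factor-analyzed representation can be more compact, the sketch below uses the textbook factor analysis model x = mean + loading · z + noise, with a standard normal latent vector z and diagonal noise. It is not the GFA-HMM of the paper: the two-level latent hierarchy, the mixture components in the latent spaces, and the HMM state sequence are all omitted. The function names, the observation dimension of 39, and the latent dimension of 5 are assumptions chosen only for the example.

```python
import numpy as np


def fa_covariance(loading, noise_var):
    """Covariance implied by x = mean + loading @ z + eps, z ~ N(0, I)."""
    return loading @ loading.T + np.diag(noise_var)


def sample_fa(rng, mean, loading, noise_var, n):
    """Draw n observation vectors from the generic factor analysis model."""
    k = loading.shape[1]
    z = rng.standard_normal((n, k))  # latent factors, one row per sample
    eps = rng.standard_normal((n, mean.size)) * np.sqrt(noise_var)  # diagonal noise
    return mean + z @ loading.T + eps


def covariance_param_counts(d, k):
    """Free covariance parameters: full matrix vs. loading matrix plus diagonal."""
    return {"full": d * (d + 1) // 2, "factor_analyzed": d * k + d}


# Hypothetical sizes, not taken from the paper.
rng = np.random.default_rng(0)
d, k = 39, 5
mean = np.zeros(d)
loading = rng.standard_normal((d, k))
noise_var = np.full(d, 0.1)

x = sample_fa(rng, mean, loading, noise_var, 1000)
counts = covariance_param_counts(d, k)
```

With these assumed sizes, the loading matrix and diagonal noise together need fewer covariance parameters than a full covariance matrix of the same dimension, which is the general sense in which a low-dimensional latent space gives a compact model.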
Pages: 435-454
Page count: 20