A Study on Hidden Markov Model's Generalization Capability for Speech Recognition

Cited by: 0
Authors
Xiao, Xiong [1]
Li, Jinyu [2]
Chng, Eng Siong [1]
Li, Haizhou [3]
Lee, Chin-Hui [4]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
[2] Microsoft Corp, Redmond, WA 98052 USA
[3] Inst Infocomm Res, Singapore 138632, Singapore
[4] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
model generalization; robustness; soft margin estimation; minimum classification error; Aurora task;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In statistical learning theory, the generalization capability of a model is its ability to perform well on unseen test data that follow the same distribution as the training data. This paper investigates how generalization capability can also improve robustness when the testing and training data come from different distributions, in the context of speech recognition. Two discriminative training (DT) methods are used to train the hidden Markov model (HMM) for better generalization capability, namely the minimum classification error (MCE) and soft-margin estimation (SME) methods. Results on the Aurora-2 task show that both SME and MCE are effective in improving one measure of an acoustic model's generalization capability, i.e. the margin of the model, with SME being moderately more effective. In addition, the better generalization capability translates into better robustness of speech recognition performance, even when there is significant mismatch between the training and testing data. We also apply mean and variance normalization (MVN) to preprocess the data and reduce the training-testing mismatch. After MVN, MCE and SME perform even better, as the generalization capability is then more closely related to robustness. The best performance on Aurora-2 is obtained with SME, which achieves about 28% relative error rate reduction over the MVN baseline system. Finally, we use SME to demonstrate the potential of better generalization capability for improving robustness on a more realistic noisy task, Aurora-3, where significant improvements are also obtained.
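As an illustration of the MVN preprocessing step mentioned in the abstract, the following is a minimal sketch of per-utterance mean and variance normalization. It is not taken from the paper: the per-utterance scope, the function name `mvn`, the `eps` guard, and the synthetic 13-dimensional features are all assumptions made for the example.

```python
import numpy as np


def mvn(features, eps=1e-8):
    """Normalize each feature dimension to zero mean and unit variance.

    features: array of shape (num_frames, num_dims) for one utterance.
    eps is a hypothetical guard against division by zero for constant dimensions.
    """
    features = np.asarray(features, dtype=np.float64)
    mean = features.mean(axis=0, keepdims=True)  # per-dimension mean over frames
    std = features.std(axis=0, keepdims=True)    # per-dimension standard deviation
    return (features - mean) / (std + eps)


# Synthetic stand-in for one utterance: 200 frames of 13-dimensional features.
rng = np.random.default_rng(0)
utterance = rng.normal(loc=5.0, scale=3.0, size=(200, 13))
normalized = mvn(utterance)
```

Applying the same normalization to both training and test utterances removes per-utterance offsets and scale differences in the features, which is one way such preprocessing can reduce training-testing mismatch.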
Pages: 118 / +
Page count: 2