Automatic speech recognition based on weighted minimum classification error (W-MCE) training method

Cited by: 3
Authors
Fu, Qiang [1]
Juang, Biing-Hwang [1]
Affiliation
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
non-uniform error cost; weighted MCE
DOI
10.1109/ASRU.2007.4430124
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Bayes decision theory [1] is the foundation of the classical statistical approach to pattern recognition. In most pattern recognition problems, it is applied under the assumption that system performance is measured by simple error counting, which assigns the same cost to every recognition error. This prevalent metric is undesirable in many practical applications; in keyword spotting systems, for example, different recognition errors must carry different costs. In this paper, we propose an extended framework for speech recognition with non-uniform classification/recognition error cost. The system performance metric weights each recognition error according to the task objective. We apply Bayes decision theory to this metric and derive the decision rule for a non-uniform error cost function. We argue that the minimum classification error (MCE) method, appropriately generalized, is the most suitable training algorithm for designing the "optimal" classifier that minimizes the weighted error rate. We formulate the weighted MCE (W-MCE) algorithm on the conventional MCE infrastructure by integrating the error cost and the recognition error count into one objective function. In the context of automatic speech recognition (ASR), we present a variety of training scenarios and weighting strategies under this extended framework. Experiments on large vocabulary continuous speech recognition support the effectiveness of our approach.
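The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the formulation from the paper, whose equations are not reproduced in this record. The first function is the textbook minimum-risk Bayes rule with a non-uniform cost matrix. The second is a hypothetical cost-weighted version of a sigmoid-smoothed MCE loss that, as a simplifying assumption, uses only the best competing class in the misclassification measure. All function names, parameters, and numbers are assumptions chosen for illustration.

```python
import math


def min_risk_decision(posteriors, cost):
    """Textbook minimum-risk Bayes rule (illustrative).

    posteriors[j] is P(class j | x).
    cost[i][j] is the cost of deciding class i when the true class is j.
    Returns (chosen class index, list of expected costs per decision).
    """
    risks = [
        sum(cost[i][j] * posteriors[j] for j in range(len(posteriors)))
        for i in range(len(cost))
    ]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


def weighted_mce_loss(scores, label, error_cost, gamma=1.0):
    """Cost-weighted, sigmoid-smoothed error for one sample (assumed form).

    scores[k] is the discriminant score of class k; label is the true class.
    The misclassification measure is simplified to: best competitor score
    minus the true-class score. error_cost scales the smoothed error.
    """
    competitor = max(s for k, s in enumerate(scores) if k != label)
    d = competitor - scores[label]
    return error_cost / (1.0 + math.exp(-gamma * d))


posteriors = [0.6, 0.4]
uniform = [[0, 1], [1, 0]]      # every error costs 1
non_uniform = [[0, 5], [1, 0]]  # deciding 0 when the truth is 1 costs 5

decision_uniform, _ = min_risk_decision(posteriors, uniform)
decision_weighted, _ = min_risk_decision(posteriors, non_uniform)
```

With uniform costs the rule reduces to picking the class with the largest posterior; with the non-uniform matrix the decision shifts toward the class whose misrecognition is more expensive, which is the effect the weighted error metric is meant to capture.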
Pages: 278-283
Number of pages: 6