Automatic speech recognition based on weighted minimum classification error (W-MCE) training method

Cited by: 3
Authors
Fu, Qiang [1]
Juang, Biing-Hwang [1]
Affiliation
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
non-uniform error cost; weighted MCE
DOI
10.1109/ASRU.2007.4430124
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bayes decision theory [1] is the foundation of the classical statistical approach to pattern recognition. In most pattern recognition problems, it is applied under the assumption that system performance is measured by simple error counting, which assigns an identical cost to every recognition error. This prevalent metric is undesirable in many practical applications. In keyword spotting systems, for example, the costs of different recognition errors must be differentiated. In this paper, we propose an extended framework for speech recognition with non-uniform classification/recognition error cost. The system performance metric weights each recognition error according to the task objective. Applying Bayes decision theory to this metric, we derive the decision rule for a non-uniform error cost function. We argue that the minimum classification error (MCE) method, after appropriate generalization, is the most suitable training algorithm for designing the "optimal" classifier that minimizes the weighted error rate. We formulate the weighted MCE (W-MCE) algorithm on the conventional MCE infrastructure by integrating the error cost and the recognition error count into one objective function. In the context of automatic speech recognition (ASR), we present a variety of training scenarios and weighting strategies under this extended framework. Experiments on large vocabulary continuous speech recognition support the effectiveness of our approach.
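The two ideas in the abstract, a Bayes decision rule under non-uniform error cost and a training objective that combines error cost with a smoothed error count, can be illustrated with a small sketch. This is not the authors' formulation: the function names, the cost-matrix layout, the per-token `weight`, and the smoothing parameters `eta` and `gamma` are all assumptions chosen for illustration, built on the textbook minimum-risk rule and the conventional sigmoid-smoothed MCE loss.

```python
import math


def min_risk_decision(posteriors, cost):
    """Return the class index with the lowest expected cost.

    posteriors[j] is P(class j | x). cost[j][i] is the assumed cost of
    deciding class i when the true class is j (hypothetical layout).
    """
    n = len(posteriors)
    risks = [sum(posteriors[j] * cost[j][i] for j in range(n)) for i in range(n)]
    return min(range(n), key=lambda i: risks[i])


def weighted_mce_loss(scores, true_idx, weight, eta=1.0, gamma=1.0):
    """Cost-weighted, sigmoid-smoothed error count for one training token.

    scores[k] is the discriminant score of class k. `weight` is an assumed
    per-token error cost; `eta` and `gamma` are illustrative smoothing
    parameters, not values from the paper.
    """
    competitors = [s for k, s in enumerate(scores) if k != true_idx]
    # Soft maximum over competing scores: (1/eta) * log(mean(exp(eta * s))),
    # computed relative to the largest score for numerical stability.
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    anti_score = m + math.log(mean_exp) / eta
    # Misclassification measure: positive when a competitor beats the true class.
    d = anti_score - scores[true_idx]
    # The sigmoid approximates the 0/1 error count; the weight scales it by cost.
    return weight / (1.0 + math.exp(-gamma * d))


uniform = [[0, 1], [1, 0]]
skewed = [[0, 1], [5, 0]]  # missing class 1 is assumed five times as costly
print(min_risk_decision([0.6, 0.4], uniform))  # picks the more probable class
print(min_risk_decision([0.6, 0.4], skewed))   # picks the costlier-to-miss class
print(weighted_mce_loss([2.0, 0.0], true_idx=0, weight=3.0))
```

With uniform cost the rule reduces to choosing the most probable class; with the skewed matrix the same posteriors lead to the other decision, which is the behavior the non-uniform framework is meant to capture.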
Pages: 278-283
Number of pages: 6