Automatic speech recognition based on weighted minimum classification error (W-MCE) training method

Cited by: 3

Authors:

Fu, Qiang [1]

Juang, Biing-Hwang [1]

Affiliation:

[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA

Source:

2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2 | 2007

Keywords:

non-uniform error cost; weighted MCE;

DOI:

10.1109/ASRU.2007.4430124

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Bayes decision theory [1] is the foundation of the classical statistical approach to pattern recognition. In most pattern recognition problems, Bayes decision theory is applied under the assumption that system performance is measured by simple error counting, which assigns an identical cost to every recognition error. However, this prevalent performance metric is undesirable in many practical applications. For example, keyword spotting systems require the cost of a "recognition" error to be differentiated. In this paper, we propose an extended framework for speech recognition with non-uniform classification/recognition error cost. As the system performance metric, the recognition error is weighted according to the task objective. Bayes decision theory is applied under this performance metric, and the decision rule with a non-uniform error cost function is derived. We argue that the minimum classification error (MCE) method, after appropriate generalization, is the most suitable training algorithm for designing the "optimal" classifier that minimizes the weighted error rate. We formulate the weighted MCE (W-MCE) algorithm on the conventional MCE infrastructure by integrating the error cost and the recognition error count into one objective function. In the context of automatic speech recognition (ASR), we present a variety of training scenarios and weighting strategies under this extended framework. An experimental demonstration on large-vocabulary continuous speech recognition is provided to support the effectiveness of our approach.
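
The two ideas in the abstract, a decision rule under non-uniform error cost and an objective that combines error cost with a smoothed error count, can be illustrated with a minimal sketch. Nothing below is taken from the paper itself: the cost-matrix layout, the function names, the parameters `eta` and `gamma`, and the choice to scale the usual sigmoid MCE loss by a per-token `error_cost` are all assumptions made for illustration, and the actual W-MCE formulation may differ.

```python
import math


def min_risk_decision(posteriors, cost):
    """Pick the class with the lowest expected cost (Bayes minimum-risk rule).

    posteriors[j] is P(class j | x); cost[i][j] is the cost of deciding
    class i when the true class is j (hypothetical layout).
    """
    risks = [
        sum(cost[i][j] * posteriors[j] for j in range(len(posteriors)))
        for i in range(len(cost))
    ]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


def misclassification_measure(scores, label, eta=1.0):
    """Conventional MCE measure: soft maximum of rival scores minus the true score."""
    rivals = [s for k, s in enumerate(scores) if k != label]
    peak = max(rivals)  # subtract the maximum for numerical stability
    mean_exp = sum(math.exp(eta * (s - peak)) for s in rivals) / len(rivals)
    return -scores[label] + peak + math.log(mean_exp) / eta


def weighted_mce_loss(scores, label, error_cost, gamma=1.0):
    """Sigmoid-smoothed error count scaled by an error cost (assumed form)."""
    d = misclassification_measure(scores, label)
    return error_cost / (1.0 + math.exp(-gamma * d))


posteriors = [0.6, 0.4]
uniform = [[0.0, 1.0], [1.0, 0.0]]        # every error costs the same
keyword_heavy = [[0.0, 5.0], [1.0, 0.0]]  # missing class 1 costs five times more

choice_uniform, _ = min_risk_decision(posteriors, uniform)
choice_weighted, _ = min_risk_decision(posteriors, keyword_heavy)
print(choice_uniform, choice_weighted)
```

With uniform costs the rule reduces to picking the most probable class; with the heavier cost on missing class 1, the decision flips to class 1 even though its posterior is lower. In the loss, a larger `error_cost` makes an error on that token contribute more to the training objective.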


Pages: 278-283

Page count: 6

50 items in total

[31] An application of minimum classification error to feature space transformations for speech recognition
de la Torre, A
Peinado, AM
Rubio, AJ
Sanchez, VE
Diaz, JE
SPEECH COMMUNICATION, 1996, 20 (3-4) : 273 - 290
[32] An improved approach to robust speech recognition using minimum error classification
Lin, MT
Spanias, A
Loizou, P
SPEECH COMMUNICATION, 2000, 30 (01) : 27 - 36
[33] SUBBAND MINIMUM CLASSIFICATION ERROR BEAMFORMING FOR SPEECH RECOGNITION IN REVERBERANT ENVIRONMENTS
Liao, Yuan-Fu
Xu, I-Yun
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4702 - 4705
[34] Minimum Hypothesis Phone Error as a Decoding Method for Speech Recognition
Xu, Haihua
Povey, Daniel
Zhu, Jie
Wu, Guanyong
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 92 - +
[35] Minimum Classification Error Training Incorporating Automatic Loss Smoothness Determination
Watanabe, Hideyuki
Tokuno, Jun'ichi
Ohashi, Tsukasa
Katagiri, Shigeru
Ohsaki, Miho
Matsuda, Shigeki
Kashioka, Hideki
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2014, 74 (03) : 311 - 322
[36] String-based minimum verification error (SB-MVE) training for speech recognition
AT&T Lab-Research, Murray Hill, United States
Comput Speech Lang, 2 (147-160):
[37] String-based minimum verification error (SB-MVE) training for speech recognition
Rahim, MG
Lee, CH
COMPUTER SPEECH AND LANGUAGE, 1997, 11 (02) : 147 - 160
[38] Minimum Classification Error Training Incorporating Automatic Loss Smoothness Determination
Hideyuki Watanabe
Jun’ichi Tokuno
Tsukasa Ohashi
Shigeru Katagiri
Miho Ohsaki
Shigeki Matsuda
Hideki Kashioka
Journal of Signal Processing Systems, 2014, 74 : 311 - 322
[39] Speech Pattern Classification Using Large Geometric Margin Minimum Classification Error Training
Kitaoka, Mikiyo
Hashimoto, Tetsuya
Ochiai, Tsubasa
Katagiri, Shigeru
Ohsaki, Miho
Watanabe, Hideyuki
Lu, Xugang
Kawai, Hisashi
TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE, 2015,
[40] MINIMUM GENERATION ERROR TRAINING WITH WEIGHTED EUCLIDEAN DISTANCE ON LSP FOR HMM-BASED SPEECH SYNTHESIS
Lei, Ming
Ling, Zhen-Hua
Dai, Li-Rong
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4230 - 4233
