Speech recognition and utterance verification based on a generalized confidence score

Cited by: 28

Authors:

Koo, MW [1]

Lee, CH [1]

Juang, BH [1]

Affiliations:

[1] Korea Telecom, Spoken Language Res Team, Multimedia Technol Lab, Seoul 137792, South Korea

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2001 / Vol. 9 / No. 08

Keywords:

confidence score; speech recognition; utterance verification;

DOI:

10.1109/89.966085

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we introduce a generalized confidence score (GCS) function that enables a framework to integrate different confidence scores in speech recognition and utterance verification. A modified decoder based on the GCS is then proposed. The GCS is defined as a combination of various confidence scores obtained by exponential weighting from various confidence information sources, such as likelihood, likelihood ratio, duration, language model probabilities, etc. We also propose the use of a confidence preprocessor to transform raw scores into manageable terms for easy integration. We consider two kinds of hybrid decoders, an ordinary hybrid decoder and an extended hybrid decoder, as implementation examples based on the generalized confidence score. The ordinary hybrid decoder uses a frame-level likelihood ratio in addition to a frame-level likelihood, while a conventional decoder uses only the frame likelihood or likelihood ratio. The extended hybrid decoder uses not only the frame-level likelihood but also multilevel information such as frame-level, phone-level, and word-level confidence scores based on the likelihood ratios. Our experimental evaluation shows that the proposed hybrid decoders give better results than those obtained by the conventional decoders, especially in dealing with ill-formed utterances that contain out-of-vocabulary words and phrases.
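As a rough illustration of the exponential weighting described in the abstract, the sketch below combines several confidence scores as a weighted product, computed in the log domain. The abstract does not give the exact formulation, so everything here is an assumption: the function names, the sigmoid used as a stand-in confidence preprocessor, and the weights and raw scores are hypothetical and not taken from the paper.

```python
import math


def preprocess(raw_score, slope=1.0, offset=0.0):
    """Hypothetical confidence preprocessor (assumption, not from the paper).

    Maps an unbounded raw score, e.g. a log-likelihood ratio, into (0, 1)
    with a sigmoid so that different sources become comparable.
    """
    return 1.0 / (1.0 + math.exp(-slope * (raw_score - offset)))


def generalized_confidence_score(scores, weights):
    """Combine confidence scores by exponential weighting.

    Computes prod(s_i ** w_i) as exp(sum(w_i * log(s_i))).
    Each score must be strictly positive.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(s <= 0.0 for s in scores):
        raise ValueError("scores must be strictly positive")
    log_gcs = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_gcs)


# Illustrative raw scores for one hypothesis (made-up numbers).
raw_likelihood_ratio = 1.2
raw_duration_score = -0.3

scores = [preprocess(raw_likelihood_ratio), preprocess(raw_duration_score)]
weights = [0.7, 0.3]  # hypothetical weights
gcs = generalized_confidence_score(scores, weights)
```

Setting one weight to zero removes that source from the combination, which is one way to see how a single function of this shape could cover both likelihood-only and hybrid decoders.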


Pages: 821-832

Page count: 12

50 items in total

[21] Speech Unit Category based Short Utterance Speaker Recognition
Fatima, Nakhat
Wu, Xiaojun
Zheng, Thomas Fang
[J]. COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2012, 9 (04) : 1407 - 1430
[22] A new decoder based on a generalized confidence score
Koo, MW
Lee, CH
Juang, BH
[J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 213 - 216
[23] UTTERANCE-LEVEL NEURAL CONFIDENCE MEASURE FOR END-TO-END CHILDREN SPEECH RECOGNITION
Liu, Wei
Lee, Tan
[J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 449 - 456
[24] Discriminative utterance verification for connected digits recognition
Rahim, MG
Lee, CH
Juang, BH
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1997, 5 (03) : 266 - 277
[25] Dynamic classifier combination in hybrid speech recognition systems using utterance-level confidence values
Kirchhoff, K
Bilmes, JA
[J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 693 - 696
[26] Utterance verification for spontaneous mandarin speech keyword spotting
Xin, L
Wang, BX
[J]. 2001 INTERNATIONAL CONFERENCES ON INFO-TECH AND INFO-NET PROCEEDINGS, CONFERENCE A-G: INFO-TECH & INFO-NET: A KEY TO BETTER LIFE, 2001, : C397 - C401
[27] Prediction of Speech Recognition Accuracy for Utterance Classification
Korenevsky, Maxim L.
Smirnov, Andrey B.
Mendelev, Valentin S.
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1275 - 1279
[28] Substate Detection Based Confidence Scoring in Speech Recognition
Punnoose, A. K.
[J]. 2020 TWENTY SIXTH NATIONAL CONFERENCE ON COMMUNICATIONS (NCC 2020), 2020,
[29] AN SVM BASED CONFIDENCE MEASURE FOR CONTINUOUS SPEECH RECOGNITION
Bardideh, Mohsen
Razzazi, Farbod
Ghassemian, Hassan
[J]. ICSPC: 2007 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, VOLS 1-3, PROCEEDINGS, 2007, : 1015 - +
[30] Bayes-based confidence measure in speech recognition
Yoma, NB
Carrasco, J
Molina, C
[J]. IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (11) : 745 - 748
