Speech recognition and utterance verification based on a generalized confidence score

Cited by: 28

Authors:

Koo, MW [1]

Lee, CH [1]

Juang, BH [1]

Affiliations:

[1] Korea Telecom, Spoken Language Res Team, Multimedia Technol Lab, Seoul 137792, South Korea

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2001 / Vol. 9 / No. 08

Keywords:

confidence score; speech recognition; utterance verification;

DOI:

10.1109/89.966085

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we introduce a generalized confidence score (GCS) function that enables a framework to integrate different confidence scores in speech recognition and utterance verification. A modified decoder based on the GCS is then proposed. The GCS is defined as a combination of various confidence scores obtained by exponential weighting from various confidence information sources, such as likelihood, likelihood ratio, duration, language model probabilities, etc. We also propose the use of a confidence preprocessor to transform raw scores into manageable terms for easy integration. We consider two kinds of hybrid decoders, an ordinary hybrid decoder and an extended hybrid decoder, as implementation examples based on the generalized confidence score. The ordinary hybrid decoder uses a frame-level likelihood ratio in addition to a frame-level likelihood, while a conventional decoder uses only the frame likelihood or likelihood ratio. The extended hybrid decoder uses not only the frame-level likelihood but also multilevel information such as frame-level, phone-level, and word-level confidence scores based on the likelihood ratios. Our experimental evaluation shows that the proposed hybrid decoders give better results than those obtained by the conventional decoders, especially in dealing with ill-formed utterances that contain out-of-vocabulary words and phrases.
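As a rough illustration of the exponential weighting mentioned in the abstract, the sketch below combines several preprocessed confidence scores as a weighted product, which is a weighted sum in the log domain. The sigmoid preprocessor, the function names, the weights, and the sample raw scores are all assumptions made for illustration; the abstract does not give the paper's actual formulation.

```python
import math


def sigmoid_preprocess(raw_score, slope=1.0, offset=0.0):
    """Hypothetical confidence preprocessor: map a raw score
    (e.g. a log-likelihood ratio) into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (raw_score - offset)))


def generalized_confidence_score(scores, weights):
    """Combine scores by exponential weighting: prod(s_i ** w_i).

    Computed as exp(sum(w_i * log(s_i))) for numerical stability.
    This is an assumed form, not the paper's definition.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(s <= 0.0 for s in scores):
        raise ValueError("scores must be strictly positive")
    log_gcs = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_gcs)


# Made-up raw scores from three hypothetical information sources.
raw = {"likelihood_ratio": 1.2, "duration": -0.3, "language_model": 0.5}
# Made-up weights; with weights summing to 1 the result is a weighted geometric mean.
weights = {"likelihood_ratio": 0.5, "duration": 0.2, "language_model": 0.3}

scores = [sigmoid_preprocess(raw[name]) for name in raw]
gcs = generalized_confidence_score(scores, [weights[name] for name in raw])
print(f"combined confidence: {gcs:.4f}")
```

In a verification setting, a combined score like this would typically be compared against a threshold to accept or reject a hypothesis; the threshold and the weights would need to be tuned on held-out data.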


Pages: 821 - 832

Page count: 12

50 items in total

[1] UTTERANCE CLASSIFICATION CONFIDENCE IN AUTOMATIC SPEECH RECOGNITION
KIMBALL, R
ROTHKOPF, MH
[J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1976, 24 (02): 188 - 189
[2] Confidence Score Based Conformer Speaker Adaptation for Speech Recognition
Deng, Jiajun
Xie, Xurong
Wang, Tianzi
Cui, Mingyu
Xue, Boyang
Jin, Zengrui
Geng, Mengzhe
Li, Guinan
Liu, Xunying
Meng, Helen
[J]. INTERSPEECH 2022, 2022: 2623 - 2627
[3] Utterance verification in continuous speech recognition: Decoding and training procedures
Lleida, E
Rose, RC
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (02): 126 - 139
[4] A new hybrid decoding algorithm for speech recognition and utterance verification
Koo, MW
Lee, CH
Juang, BH
[J]. 1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997: 303 - 310
[5] Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition
Sukkar, RA
Lee, CH
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, 4 (06): 420 - 429
[6] Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios
Kumar, Ankur
Singh, Sachin
Gowda, Dhananjaya
Garg, Abhinav
Singh, Shatrughan
Kim, Chanwoo
[J]. INTERSPEECH 2020, 2020: 4357 - 4361
[7] Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Deng, Jiajun
Xie, Xurong
Wang, Tianzi
Cui, Mingyu
Xue, Boyang
Jin, Zengrui
Li, Guinan
Hu, Shujie
Liu, Xunying
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31: 1175 - 1190
[8] A confidence-score based unsupervised map adaptation for speech recognition
Wang, DG
Narayanan, SS
[J]. THIRTY-SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS - CONFERENCE RECORD, VOLS 1 AND 2, CONFERENCE RECORD, 2002: 222 - 226
[9] Overlapping One-Class SVMs for Utterance Verification in Speech Recognition
Hou, Cuiqin
Hou, Yibin
Huang, Zhangqin
Liu, Qian
[J]. TRUSTCOM 2011: 2011 INTERNATIONAL JOINT CONFERENCE OF IEEE TRUSTCOM-11/IEEE ICESS-11/FCST-11, 2011: 1500 - 1504
[10] Efficient decoding and training procedures for utterance verification in continuous speech recognition
Lleida, E
Rose, RC
[J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 507 - 510
