Speech recognition and utterance verification based on a generalized confidence score

Cited by: 28
Authors
Koo, MW [1]
Lee, CH [1]
Juang, BH [1]
Affiliations
[1] Korea Telecom, Spoken Language Res Team, Multimedia Technol Lab, Seoul 137792, South Korea
Source
Keywords
confidence score; speech recognition; utterance verification
DOI
10.1109/89.966085
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we introduce a generalized confidence score (GCS) function that provides a framework for integrating different confidence scores in speech recognition and utterance verification. A modified decoder based on the GCS is then proposed. The GCS is defined as a combination of various confidence scores obtained by exponential weighting from various confidence information sources, such as likelihood, likelihood ratio, duration, and language model probabilities. We also propose the use of a confidence preprocessor to transform raw scores into manageable terms for easy integration. We consider two kinds of hybrid decoders, an ordinary hybrid decoder and an extended hybrid decoder, as implementation examples based on the generalized confidence score. The ordinary hybrid decoder uses a frame-level likelihood ratio in addition to a frame-level likelihood, while a conventional decoder uses only the frame likelihood or likelihood ratio. The extended hybrid decoder uses not only the frame-level likelihood but also multilevel information such as frame-level, phone-level, and word-level confidence scores based on the likelihood ratios. Our experimental evaluation shows that the proposed hybrid decoders give better results than those obtained by the conventional decoders, especially in dealing with ill-formed utterances that contain out-of-vocabulary words and phrases.
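The abstract describes the GCS only in words, so the following is a minimal sketch of one plausible reading, not the authors' formulation. It assumes the GCS is a weighted product of preprocessed scores, GCS = prod(s_i ** w_i), computed in the log domain, and that the confidence preprocessor is a sigmoid mapping raw scores into (0, 1). The function names, the sigmoid choice, the `slope` and `offset` parameters, and the example weights are all hypothetical.

```python
import math

def preprocess(raw_score, slope=1.0, offset=0.0):
    """Hypothetical confidence preprocessor: sigmoid squashing into (0, 1).

    The paper's actual transform is not given in the abstract; the
    sigmoid here is an assumption for illustration.
    """
    x = slope * (raw_score - offset)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # stable branch for negative inputs
    return e / (1.0 + e)

def generalized_confidence_score(scores, weights, floor=1e-12):
    """Combine scores by exponential weighting: prod(s_i ** w_i).

    Assumed form only. Scores are clamped to `floor` so the logarithm
    stays finite; the sum of weighted logs equals the log of the product.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    log_gcs = sum(w * math.log(max(s, floor)) for s, w in zip(scores, weights))
    return math.exp(log_gcs)

if __name__ == "__main__":
    # Illustrative raw inputs: a log-likelihood ratio and a duration score.
    raw = {"likelihood_ratio": 1.3, "duration": -0.4}
    scores = [preprocess(v) for v in raw.values()]
    print(generalized_confidence_score(scores, weights=[0.7, 0.3]))
```

With this form, a weight of zero removes a source from the combination, and a single weight of one on the likelihood term reduces the score to that term alone, which is one way to view a conventional decoder as a special case.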
Pages: 821-832
Page count: 12
Related papers
50 in total
  • [1] UTTERANCE CLASSIFICATION CONFIDENCE IN AUTOMATIC SPEECH RECOGNITION
    KIMBALL, R
    ROTHKOPF, MH
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1976, 24 (02): 188-189
  • [2] Confidence Score Based Conformer Speaker Adaptation for Speech Recognition
    Deng, Jiajun
    Xie, Xurong
    Wang, Tianzi
    Cui, Mingyu
    Xue, Boyang
    Jin, Zengrui
    Geng, Mengzhe
    Li, Guinan
    Liu, Xunying
    Meng, Helen
    [J]. INTERSPEECH 2022, 2022: 2623-2627
  • [3] Utterance verification in continuous speech recognition: Decoding and training procedures
    Lleida, E
    Rose, RC
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (02): 126-139
  • [4] A new hybrid decoding algorithm for speech recognition and utterance verification
    Koo, MW
    Lee, CH
    Juang, BH
    [J]. 1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997: 303-310
  • [5] Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition
    Sukkar, RA
    Lee, CH
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, 4 (06): 420-429
  • [6] Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios
    Kumar, Ankur
    Singh, Sachin
    Gowda, Dhananjaya
    Garg, Abhinav
    Singh, Shatrughan
    Kim, Chanwoo
    [J]. INTERSPEECH 2020, 2020: 4357-4361
  • [7] Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
    Deng, Jiajun
    Xie, Xurong
    Wang, Tianzi
    Cui, Mingyu
    Xue, Boyang
    Jin, Zengrui
    Li, Guinan
    Hu, Shujie
    Liu, Xunying
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31: 1175-1190
  • [8] A confidence-score based unsupervised map adaptation for speech recognition
    Wang, DG
    Narayanan, SS
    [J]. THIRTY-SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS - CONFERENCE RECORD, VOLS 1 AND 2, CONFERENCE RECORD, 2002: 222-226
  • [9] Overlapping One-Class SVMs for Utterance Verification in Speech Recognition
    Hou, Cuiqin
    Hou, Yibin
    Huang, Zhangqin
    Liu, Qian
    [J]. TRUSTCOM 2011: 2011 INTERNATIONAL JOINT CONFERENCE OF IEEE TRUSTCOM-11/IEEE ICESS-11/FCST-11, 2011: 1500-1504
  • [10] Efficient decoding and training procedures for utterance verification in continuous speech recognition
    Lleida, E
    Rose, RC
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 507-510