Speech recognition and utterance verification based on a generalized confidence score

Cited by: 28
Authors
Koo, MW [1]
Lee, CH [1]
Juang, BH [1]
Affiliation
[1] Korea Telecom, Spoken Language Res Team, Multimedia Technol Lab, Seoul 137792, South Korea
Keywords
confidence score; speech recognition; utterance verification
DOI
10.1109/89.966085
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we introduce a generalized confidence score (GCS) function that enables a framework to integrate different confidence scores in speech recognition and utterance verification. A modified decoder based on the GCS is then proposed. The GCS is defined as a combination of various confidence scores obtained by exponential weighting from various confidence information sources, such as likelihood, likelihood ratio, duration, language model probabilities, etc. We also propose the use of a confidence preprocessor to transform raw scores into manageable terms for easy integration. We consider two kinds of hybrid decoders, an ordinary hybrid decoder and an extended hybrid decoder, as implementation examples based on the generalized confidence score. The ordinary hybrid decoder uses a frame-level likelihood ratio in addition to a frame-level likelihood, while a conventional decoder uses only the frame likelihood or likelihood ratio. The extended hybrid decoder uses not only the frame-level likelihood but also multilevel information such as frame-level, phone-level, and word-level confidence scores based on the likelihood ratios. Our experimental evaluation shows that the proposed hybrid decoders give better results than those obtained by the conventional decoders, especially in dealing with ill-formed utterances that contain out-of-vocabulary words and phrases.
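The abstract does not give the GCS formula, so the following is only a minimal sketch of one reading of "a combination of various confidence scores obtained by exponential weighting": each preprocessed score is raised to its weight and the results are multiplied, which is a weighted sum in the log domain. The sigmoid preprocessor, the function names `preprocess` and `generalized_confidence_score`, the `floor` parameter, and all numeric values are assumptions made for illustration, not the authors' definitions.

```python
import math


def preprocess(raw_score, slope=1.0, offset=0.0):
    """Hypothetical confidence preprocessor (assumed form, not from the paper).

    Maps an unbounded raw score, such as a log-likelihood ratio, into [0, 1]
    with a sigmoid so that different sources share a comparable range.
    """
    z = slope * (raw_score - offset)
    # Numerically stable sigmoid: avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def generalized_confidence_score(scores, weights, floor=1e-12):
    """Assumed GCS form: product of scores[i] ** weights[i].

    Computed as a weighted sum of logs. Scores are clamped to `floor`
    so that a zero score does not produce log(0).
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    log_gcs = sum(w * math.log(max(s, floor)) for s, w in zip(scores, weights))
    return math.exp(log_gcs)


# Illustrative values only: a frame-level likelihood score and a
# frame-level likelihood-ratio score, as in the "ordinary hybrid decoder".
likelihood_conf = preprocess(2.0)         # hypothetical raw log-likelihood term
likelihood_ratio_conf = preprocess(-0.5)  # hypothetical raw log-likelihood ratio
gcs = generalized_confidence_score(
    [likelihood_conf, likelihood_ratio_conf],
    [1.0, 0.5],                           # hypothetical exponential weights
)
```

Under this reading, a weight of zero removes a source from the combination, and the multilevel information of the extended hybrid decoder (frame-level, phone-level, and word-level scores) would simply add further terms to the same weighted product.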
Pages: 821-832
Number of pages: 12