Word Error Rate Estimation for Speech Recognition: e-WER

Authors
Ali, Ahmed [1]
Renals, Steve [2]
Affiliations
[1] QCRI, Doha, Qatar
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Keywords
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data in order to compute the word error rate (WER), and such transcriptions are often time-consuming and expensive to produce. In this paper, we propose a novel approach to estimating WER, called e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR recognised text, character recognition results to complement recognition output, and internal decoder features. We report results for two feature sets, black-box and glass-box, using 24 unseen Arabic broadcast programs. Our system achieves a WER root mean squared error (RMSE) of 16.9% across 1,400 sentences. The estimated overall WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
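The two quantities the abstract reports, WER and the RMSE between estimated and actual WER, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `word_error_rate` and `rmse` are hypothetical, WER is computed with a standard word-level edit distance against a reference transcript, and the e-WER estimator itself (which predicts WER without a reference) is not shown.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration; tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def rmse(estimated: list[float], actual: list[float]) -> float:
    """Root mean squared error between per-sentence estimated and actual WER."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

In an evaluation of a WER estimator, `actual` would come from `word_error_rate` on sentences that do have reference transcripts, and `estimated` from the estimator's predictions for the same sentences; the RMSE then summarises how far the per-sentence estimates are from the true values.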
Pages: 20-24
Page count: 5