Word Error Rate Estimation for Speech Recognition: e-WER

Cited by: 0

Authors:

Ali, Ahmed [1]

Renals, Steve [2]

Affiliations:

[1] QCRI, Doha, Qatar

[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland

Source:

PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2 | 2018

Keywords:

DOI:

Not available

Chinese Library Classification:

TP39 [Computer applications];

Subject classification code:

081203 ; 0835 ;

Abstract:

Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data to compute the word error rate (WER), and producing such transcriptions is often time-consuming and expensive. In this paper, we propose a novel approach to estimate WER, or e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR-recognised text, character recognition results to complement the recognition output, and internal decoder features. We report results for two feature types, black-box and glass-box, on 24 unseen Arabic broadcast programs. Our system achieves a WER root mean squared error (RMSE) of 16.9% across 1,400 sentences. The estimated overall WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
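The two evaluation quantities named in the abstract, WER and the RMSE between estimated and actual WER, can be illustrated with a minimal Python sketch. This is not the authors' e-WER estimator: it only shows the standard word-level edit-distance definition of WER and a plain RMSE over per-sentence values. The function names `word_errors`, `wer`, and `rmse`, and the example sentences, are hypothetical and chosen for illustration.

```python
import math


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """WER = word errors divided by the number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / n_ref


def rmse(estimated: list[float], actual: list[float]) -> float:
    """Root mean squared error between estimated and actual per-sentence WER."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion over six words.
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    # Hypothetical per-sentence WER estimates versus actual values.
    print(rmse([0.2, 0.4], [0.3, 0.3]))
```

Computing actual WER this way needs a reference transcript for every sentence, which is the cost the abstract says e-WER is meant to avoid.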


Pages: 20-24

Page count: 5


