Predicting Automatic Speech Recognition Performance over Communication Channels from Instrumental Speech Quality and Intelligibility Scores

Cited by: 9
Authors
Gallardo, Laura Fernandez [1]
Moeller, Sebastian [1]
Beerends, John [2]
Affiliations
[1] TU Berlin, Qual & Usabil Lab, Telekom Innovat Labs, Berlin, Germany
[2] TNO, The Hague, Netherlands
Keywords
automatic speech recognition; speech intelligibility; instrumental speech quality; communication channels; ITU-T STANDARD; ASSESSMENT POLQA;
DOI
10.21437/Interspeech.2017-36
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The performance of automatic speech recognition based on coded-decoded speech depends heavily on the quality of the transmitted signals, which is determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. In contrast to previous studies, the effects of super-wideband transmissions are analyzed and compared to those of wideband and narrowband channels. Furthermore, intelligibility scores, gathered by conducting a listening test based on logatomes, are also considered for the prediction of automatic speech recognition results. The modern instrumental measurement techniques POLQA and POLQA-based intelligibility have been applied to estimate the quality and the intelligibility of transmitted speech, respectively. Based on our results, polynomial models are proposed that permit the prediction of speech recognition accuracy from the subjective and instrumental measures, involving a number of channel distortions in the three bandwidths. This approach can save the costs of performing automatic speech recognition experiments and can be seen as a first step towards a useful tool for communication channel designers.
Pages: 2939 - 2943
Number of pages: 5
Related papers
50 in total
  • [21] LEVERAGING AUTOMATIC SPEECH RECOGNITION IN COCHLEAR IMPLANTS FOR IMPROVED SPEECH INTELLIGIBILITY UNDER REVERBERATION
    Hazrati, Oldooz
    Ghaffarzadegan, Shabnam
    Hansen, John H. L.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5093 - 5097
  • [22] Investigation of Automatic Speech Recognition Performance and Mean Opinion Scores for Different Standard Speech and Audio Codecs
    Ramana, A. V.
    Parayitam, Laxminarayana
    Pala, Mythili Sharan
    IETE JOURNAL OF RESEARCH, 2012, 58 (02) : 121 - 129
  • [23] Automatic Speech Recognition Performance for Training on Noised Speech
    Prodeus, Arkadiy
    Kukharicheva, Kateryna
    2017 2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION AND COMMUNICATION TECHNOLOGIES-2017 (AICT 2017), 2017, : 71 - 74
  • [24] Speech intelligibility and the subjective assessment of speech quality in near real communication conditions
    Volberg, Leonie
    Kulka, Marko
    Sust, Charlotte A.
    Lazarus, Hans
    ACTA ACUSTICA UNITED WITH ACUSTICA, 2006, 92 (03) : 406 - 416
  • [25] Speech intelligibility and the subjective assessment of speech quality in near real communication conditions
    Above GmbH, Dresdener Str. 11, 35435 Wettenberg, Germany
    Unknown
    Acta Acust. United Acust., 2006, 3 (406-416):
  • [26] ON PREDICTING THE INTELLIGIBILITY OF SPEECH FROM ACOUSTICAL MEASURES
    KRYTER, KD
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1956, 28 (04) : 590 - 590
  • [27] Predicting speech intelligibility from a population of neurons
    Bondy, J
    Bruce, IC
    Becker, S
    Raykin, S
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1409 - 1416
  • [28] ON PREDICTING THE INTELLIGIBILITY OF SPEECH FROM ACOUSTICAL MEASURES
    KRYTER, KD
    JOURNAL OF SPEECH AND HEARING DISORDERS, 1956, 21 (02) : 208 - 217
  • [29] Intelligibility Rating with Automatic Speech Recognition, Prosodic, and Cepstral Evaluation
    Haderlein, Tino
    Moers, Cornelia
    Moebius, Bernd
    Rosanowski, Frank
    Noeth, Elmar
    TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 195 - 202
  • [30] Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction
    Tu, Zehai
    Ma, Ning
    Barker, Jon
    INTERSPEECH 2022, 2022, : 3493 - 3497