Predicting Automatic Speech Recognition Performance over Communication Channels from Instrumental Speech Quality and Intelligibility Scores

Cited by: 9
Authors
Gallardo, Laura Fernandez [1]
Moeller, Sebastian [1]
Beerends, John [2]
Affiliations
[1] TU Berlin, Qual & Usabil Lab, Telekom Innovat Labs, Berlin, Germany
[2] TNO, The Hague, Netherlands
Keywords
automatic speech recognition; speech intelligibility; instrumental speech quality; communication channels; ITU-T STANDARD; ASSESSMENT POLQA
DOI
10.21437/Interspeech.2017-36
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. Different to previous studies, the effects of super-wideband transmissions are analyzed and compared to those of wideband and narrowband channels. Furthermore, intelligibility scores, gathered by conducting a listening test based on logatomes, are also considered for the prediction of automatic speech recognition results. The modern instrumental measurement techniques POLQA and POLQA-based intelligibility have been respectively applied to estimate the quality and the intelligibility of transmitted speech. Based on our results, polynomial models are proposed that permit the prediction of speech recognition accuracy from the subjective and instrumental measures, involving a number of channel distortions in the three bandwidths. This approach can save the costs of performing automatic speech recognition experiments and can be seen as a first step towards a useful tool for communication channel designers.
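The abstract mentions polynomial models that predict recognition accuracy from quality or intelligibility scores. A minimal sketch of that kind of fit follows, assuming a second-order polynomial and an ordinary least-squares criterion; neither is stated in this record. The function names and all numeric values are hypothetical placeholders, not data or results from the paper.

```python
# Illustrative only: the scores and accuracies below are synthetic placeholders,
# not results from the paper. The polynomial order (2) is an assumption.

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    # Power sums S[k] = sum(x^k) for k = 0..4, moments T[k] = sum(y * x^k) for k = 0..2.
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix of the normal equations.
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]  # [a, b, c]


def predict(coeffs, x):
    """Evaluate the fitted quadratic at score x."""
    a, b, c = coeffs
    return a + b * x + c * x * x


# Hypothetical quality scores on a 1-5 MOS-like scale and accuracies in percent.
mos_scores = [1.5, 2.0, 2.8, 3.5, 4.1, 4.6]
asr_accuracy = [35.0, 48.0, 63.0, 74.0, 80.0, 83.0]

coeffs = fit_quadratic(mos_scores, asr_accuracy)
estimate = predict(coeffs, 3.0)  # predicted accuracy for a score of 3.0
```

With real measurements, the polynomial order would presumably be chosen by comparing goodness of fit across candidate orders rather than fixed in advance.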
Pages: 2939-2943
Page count: 5