Predicting Automatic Speech Recognition Performance over Communication Channels from Instrumental Speech Quality and Intelligibility Scores

Cited by: 9
Authors
Gallardo, Laura Fernandez [1]
Moeller, Sebastian [1]
Beerends, John [2]
Affiliations
[1] TU Berlin, Qual & Usabil Lab, Telekom Innovat Labs, Berlin, Germany
[2] TNO, The Hague, Netherlands
Keywords
automatic speech recognition; speech intelligibility; instrumental speech quality; communication channels; ITU-T standard; assessment POLQA
DOI
10.21437/Interspeech.2017-36
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. Different to previous studies, the effects of super-wideband transmissions are analyzed and compared to those of wideband and narrowband channels. Furthermore, intelligibility scores, gathered by conducting a listening test based on logatomes, are also considered for the prediction of automatic speech recognition results. The modern instrumental measurement techniques POLQA and POLQA-based intelligibility have been respectively applied to estimate the quality and the intelligibility of transmitted speech. Based on our results, polynomial models are proposed that permit the prediction of speech recognition accuracy from the subjective and instrumental measures, involving a number of channel distortions in the three bandwidths. This approach can save the costs of performing automatic speech recognition experiments and can be seen as a first step towards a useful tool for communication channel designers.
Pages: 2939-2943
Page count: 5