Predicting Automatic Speech Recognition Performance over Communication Channels from Instrumental Speech Quality and Intelligibility Scores

Cited by: 9

Authors:

Gallardo, Laura Fernandez [1]

Moeller, Sebastian [1]

Beerends, John [2]

Affiliations:

[1] TU Berlin, Qual & Usabil Lab, Telekom Innovat Labs, Berlin, Germany

[2] TNO, The Hague, Netherlands

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

automatic speech recognition; speech intelligibility; instrumental speech quality; communication channels; ITU-T STANDARD; ASSESSMENT POLQA;

DOI:

10.21437/Interspeech.2017-36

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. Different to previous studies, the effects of super-wideband transmissions are analyzed and compared to those of wideband and narrowband channels. Furthermore, intelligibility scores, gathered by conducting a listening test based on logatomes, are also considered for the prediction of automatic speech recognition results. The modern instrumental measurement techniques POLQA and POLQA-based intelligibility have been respectively applied to estimate the quality and the intelligibility of transmitted speech. Based on our results, polynomial models are proposed that permit the prediction of speech recognition accuracy from the subjective and instrumental measures, involving a number of channel distortions in the three bandwidths. This approach can save the costs of performing automatic speech recognition experiments and can be seen as a first step towards a useful tool for communication channel designers.


Pages: 2939-2943

Number of pages: 5
