Speech intelligibility prediction using a Neurogram Similarity Index Measure

Cited by: 60

Authors:

Hines, Andrew [1]

Harte, Naomi [1]

Affiliations:

[1] Trinity Coll Dublin, Sigmedia Grp, Dept Elect & Elect Engn, Dublin, Ireland

Source:

SPEECH COMMUNICATION | 2012, Vol. 54, No. 2

Keywords:

Auditory periphery model; Simulated performance intensity function; NSIM; SSIM; Speech Intelligibility; QUALITY ASSESSMENT; PHENOMENOLOGICAL MODEL; TEMPORAL INFORMATION; NORMAL-HEARING; RECOGNITION; RESPONSES; LOUDNESS; PHONEME

DOI:

10.1016/j.specom.2011.09.004

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Discharge patterns produced by fibres from normal and impaired auditory nerves in response to speech and other complex sounds can be discriminated subjectively through visual inspection. Similarly, responses from auditory nerves where speech is presented at diminishing sound levels progressively deteriorate from those at normal listening levels. This paper presents a Neurogram Similarity Index Measure (NSIM) that automates this inspection process, and translates the response pattern differences into a bounded discrimination metric. Performance intensity functions can be used to provide additional information over measurement of speech reception threshold and maximum phoneme recognition by plotting a test subject's recognition probability over a range of sound intensities. A computational model of the auditory periphery was used to replace the human subject and develop a methodology that simulates a real listener test. The newly developed NSIM is used to evaluate the model outputs in response to Consonant-Vowel-Consonant (CVC) word lists and produce phoneme discrimination scores. The simulated results are rigorously compared to those from normal hearing subjects in both quiet and noise conditions. The accuracy of the tests and the minimum number of word lists necessary for repeatable results is established and the results are compared to predictions using the speech intelligibility index (SII). The experiments demonstrate that the proposed simulated performance intensity function (SPIF) produces results with confidence intervals within the human error bounds expected with real listener tests. This work represents an important step in validating the use of auditory nerve models to predict speech intelligibility. (C) 2011 Elsevier B.V. All rights reserved.
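The abstract describes NSIM as a bounded metric that compares response patterns, and the keywords list SSIM alongside it. As a rough illustration of that idea only, the sketch below computes an SSIM-style score (a luminance term multiplied by a structure term) over one global window for two equal-sized grids of values. The function name `similarity_index`, the single-window simplification, the constants `k1` and `k2`, and the toy data are all assumptions made for this example; the record does not give the paper's windowing, weighting, or constants, so this should not be read as the authors' formulation.

```python
import math


def similarity_index(ref, deg, dynamic_range=1.0, k1=0.01, k2=0.03):
    """SSIM-style luminance x structure score for two equal-sized 2-D grids.

    Hypothetical helper for illustration: one global window, with the
    stabilising constants of the generic SSIM framework (assumed values).
    """
    r = [v for row in ref for v in row]
    d = [v for row in deg for v in row]
    if not r or len(r) != len(d):
        raise ValueError("inputs must be non-empty and the same size")

    n = len(r)
    mu_r = sum(r) / n
    mu_d = sum(d) / n
    var_r = sum((v - mu_r) ** 2 for v in r) / n
    var_d = sum((v - mu_d) ** 2 for v in d) / n
    cov = sum((a - mu_r) * (b - mu_d) for a, b in zip(r, d)) / n

    c1 = (k1 * dynamic_range) ** 2        # stabilises the luminance term
    c3 = (k2 * dynamic_range) ** 2 / 2.0  # stabilises the structure term

    luminance = (2 * mu_r * mu_d + c1) / (mu_r ** 2 + mu_d ** 2 + c1)
    structure = (cov + c3) / (math.sqrt(var_r * var_d) + c3)
    return luminance * structure


# Toy "neurogram" values (made up for the example, not real model output).
reference = [[0.0, 0.2, 0.9], [0.1, 0.8, 0.3]]
attenuated = [[0.5 * v for v in row] for row in reference]

score_same = similarity_index(reference, reference)
score_attenuated = similarity_index(reference, attenuated)
print(score_same, score_attenuated)
```

With non-negative inputs the score is at most 1, and it reaches 1 when the two grids are identical; the attenuated copy scores lower because its mean level differs from the reference.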

Pages: 306-320

Page count: 15
