Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing

Cited by: 156
Authors
Jorgensen, Soren [1]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Dept Elect Engn, Ctr Appl Hearing Res, DK-2800 Lyngby, Denmark
Source
Keywords
MASKING-LEVEL DIFFERENCES; AMPLITUDE-MODULATION; RECEPTION THRESHOLD; TRANSMISSION INDEX; TEMPORAL ENVELOPE; ROOM ACOUSTICS; COMPRESSION; SPECTRUM; RECOGNITION; INTENSITY;
DOI
10.1121/1.3621502
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure as the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation frequency selective process provides a key measure of speech intelligibility. [DOI: 10.1121/1.3621502]
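The SNRenv metric named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the function names `envelope_power` and `snr_env`, the rectangular FFT band standing in for one modulation filter, the normalization of envelope power by the squared DC component, the estimate of speech envelope power as the mixture envelope power minus the noise envelope power, and the `floor` value.

```python
import numpy as np

def envelope_power(env, fs, f_lo, f_hi):
    """AC envelope power in the modulation band [f_lo, f_hi) Hz,
    normalized by the squared DC value of the envelope.
    Assumes f_hi is below the Nyquist frequency fs / 2."""
    env = np.asarray(env, dtype=float)
    n = env.size
    spec = np.fft.rfft(env) / n               # amplitude-scaled one-sided spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs < f_hi)   # rectangular stand-in for a modulation filter
    ac_power = 2.0 * np.sum(np.abs(spec[band]) ** 2)  # factor 2: one-sided spectrum
    return ac_power / np.abs(spec[0]) ** 2

def snr_env(env_mix, env_noise, fs, f_lo, f_hi, floor=1e-4):
    """Envelope-domain SNR in one modulation band (illustrative only).
    Speech envelope power is assumed to be mixture power minus noise power."""
    p_mix = envelope_power(env_mix, fs, f_lo, f_hi)
    p_noise = max(envelope_power(env_noise, fs, f_lo, f_hi), floor)
    p_speech = max(p_mix - p_noise, floor)    # hypothetical floor keeps the ratio positive
    return p_speech / p_noise

# Usage: synthetic envelopes with a 4 Hz modulation of different depths
fs = 100
t = np.arange(fs) / fs                         # 1 s of envelope samples
env_mix = 1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)
env_noise = 1.0 + 0.1 * np.cos(2 * np.pi * 4 * t)
ratio = snr_env(env_mix, env_noise, fs, 2.0, 8.0)
```

In this sketch a sinusoidally modulated envelope with modulation depth m yields a normalized envelope power of m²/2, so a deeper modulation in the mixture than in the noise alone gives a larger ratio. Mapping such ratios to intelligibility through an ideal observer, as the abstract describes, is not covered here.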
Pages: 1475-1487
Number of pages: 13
Related papers
50 in total
  • [21] IMPROVEMENT IN SIGNAL-TO-NOISE RATIO BY SPAC (SPEECH PROCESSING SYSTEM BY USE OF AUTOCORRELATION FUNCTION)
    YOSHIYA, K
    SUZUKI, J
    JOURNAL OF THE RADIO RESEARCH LABORATORY, 1977, 24 (115) : 137 - 148
  • [22] IMPROVEMENT IN SIGNAL-TO-NOISE RATIO BY SPAC (SPEECH PROCESSING SYSTEM USING AUTOCORRELATION FUNCTION)
    YOSHIYA, K
    SUZUKI, J
    ELECTRONICS & COMMUNICATIONS IN JAPAN, 1978, 61 (03) : 18 - 25
  • [23] Selective Frequency Enhancement of Speech Signal for Intelligibility Improvement in Presence of Near-end Noise
    Premananda, B. S.
    Uma, B. V.
    PROCEEDINGS OF 4TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND CONTROL (ICAC3'15), 2015, 49 : 244 - 252
  • [24] SIGNAL-TO-NOISE RATIO OF A VIDEO SIGNAL TRANSMITTED BY A FIBEROPTIC SYSTEM USING PULSE-FREQUENCY MODULATION
    TIMMERMAN, CC
    IEEE TRANSACTIONS ON BROADCASTING, 1977, 23 (01) : 12 - 16
  • [25] Speech enhancement based on the modified phase using signal-to-noise ratio information and time-frequency characteristics
    Jia H.
    Wang W.
    Ji H.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2019, 46 (05) : 162 - 170
  • [26] Method for optical signal-to-noise ratio monitoring based on modulation spectrum assessment
    Dlubek, M. P.
    Phillips, A. J.
    Larkins, E. C.
    IET OPTOELECTRONICS, 2009, 3 (02) : 86 - 92
  • [27] Radar Signal Modulation Recognition Based on Split EfficientNet Under Low Signal-to-Noise Ratio
    Li Q.
    Liu W.
    Niu C.-Y.
    Bao Y.-T.
    Hui Z.-B.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (03) : 675 - 686
  • [28] ESTIMATION OF SIGNAL-TO-NOISE RATIO OF WIDE-BAND SPATIAL-FREQUENCY PROCESSING SYSTEM BASED ON FFT
    GUSEV, VG
    RADIOTEKHNIKA I ELEKTRONIKA, 1993, 38 (07) : 1302 - 1310
  • [29] Signal-to-noise ratio measurement for high-power frequency doubled laser system
    Chen, Lanrong
    Zhi, Tingting
    Cai, Xijie
    Tang, Fulin
    Wu, Fengcun
    Zhongguo Jiguang/Chinese Journal of Lasers, 1997, 24 (03) : 228 - 230
  • [30] Accuracy of speech transmission index predictions based on the reverberation time and signal-to-noise ratio
    Galbrun, Laurent
    Kitapci, Kivanc
    APPLIED ACOUSTICS, 2014, 81 : 1 - 14