Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing

Cited by: 156
Authors
Jorgensen, Soren [1]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Dept Elect Engn, Ctr Appl Hearing Res, DK-2800 Lyngby, Denmark
Source
Keywords
MASKING-LEVEL DIFFERENCES; AMPLITUDE-MODULATION; RECEPTION THRESHOLD; TRANSMISSION INDEX; TEMPORAL ENVELOPE; ROOM ACOUSTICS; COMPRESSION; SPECTRUM; RECOGNITION; INTENSITY;
DOI
10.1121/1.3621502
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a structure similar to that of the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation frequency selective process provides a key measure of speech intelligibility. [DOI: 10.1121/1.3621502]
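As a rough illustration of the SNRenv idea described in the abstract, the Python sketch below estimates the envelope power of a noisy-speech signal and of the noise alone in a set of modulation bands, then forms a per-band envelope power ratio. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `hilbert_envelope`, `envelope_power`, and `snr_env`, the rectangular modulation bands standing in for a modulation filterbank, the normalization by DC power, the estimate of speech envelope power as mixture minus noise, and the `floor` parameter are all hypothetical choices. The audio-frequency filtering and the ideal-observer stage are omitted.

```python
import numpy as np


def hilbert_envelope(x):
    """Temporal envelope as the magnitude of the analytic signal (FFT method)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # zero negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def envelope_power(x, fs, f_lo, f_hi):
    """Envelope power in the modulation band [f_lo, f_hi), relative to DC power.

    Assumption: a rectangular band in the envelope spectrum stands in for
    one modulation filter. Use f_lo > 0 so the DC component is excluded.
    """
    env = hilbert_envelope(np.asarray(x, dtype=float))
    coeffs = np.fft.rfft(env) / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    # Factor 2 accounts for the matching negative-frequency components.
    ac_power = 2.0 * np.sum(np.abs(coeffs[in_band]) ** 2)
    dc_power = np.abs(coeffs[0]) ** 2
    return ac_power / dc_power


def snr_env(noisy_speech, noise, fs, bands, floor=1e-4):
    """Per-band envelope power ratio of (estimated) speech to noise.

    Assumption: speech envelope power is estimated as mixture minus noise,
    and both terms are limited from below by a hypothetical `floor`.
    """
    ratios = []
    for f_lo, f_hi in bands:
        p_mix = envelope_power(noisy_speech, fs, f_lo, f_hi)
        p_noise = max(envelope_power(noise, fs, f_lo, f_hi), floor)
        p_speech = max(p_mix - p_noise, floor)
        ratios.append(p_speech / p_noise)
    return np.array(ratios)
```

With this normalization, a tone carrying sinusoidal amplitude modulation of depth m has an envelope power of m²/2 in the band that contains the modulation frequency, which gives a quick sanity check for `envelope_power`.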
Pages: 1475-1487
Number of pages: 13
Related papers
50 in total
  • [31] A Fast Signal Parameter Estimation Algorithm for Linear Frequency Modulation Signal under Low Signal-to-Noise Ratio Based on Fractional Fourier Transform
    Liu Limin
    Li Haoxin
    Li Qi
    Han Zhuangzhi
    Gao Zhenbin
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (10) : 2798 - 2804
  • [32] Deep Learning-Based Modulation Recognition for Low Signal-to-Noise Ratio Environments
    He, Peng
    Zhang, Yang
    Yang, Xinyue
    Xiao, Xiao
    Wang, Haolin
    Zhang, Rongsheng
    ELECTRONICS, 2022, 11 (23)
  • [33] Method to reduce the signal-to-noise ratio required for modulation recognition based on logarithmic properties
    Xing, Zhe
    Gao, Yong
    IET COMMUNICATIONS, 2018, 12 (11) : 1360 - 1366
  • [34] Improvement of the Signal-to-Noise Ratio of the Clock Signal for the Frequency Standard Based on 113Cd+ ions
    Miao, K.
    Zhang, J. W.
    Wang, S. G.
    Wang, Z. B.
    Wang, L. J.
    2014 IEEE INTERNATIONAL FREQUENCY CONTROL SYMPOSIUM (FCS), 2014 : 563 - 565
  • [35] A FEATURE STUDY FOR CLASSIFICATION-BASED SPEECH SEPARATION AT VERY LOW SIGNAL-TO-NOISE RATIO
    Chen, Jitong
    Wang, Yuxuan
    Wang, DeLiang
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [36] Stabilization over Frequency-Selective Channels Subject to Transmission Delay and Signal-to-Noise Ratio Limitations
    Barforooshan, Mohsen
    Esfanjani, Reza Mahboobi
    COMPLEXITY, 2016, 21 (S1) : 557 - 565
  • [37] Gain, Signal-to-Noise Ratio and Power Optimization of Envelope Detector for Ultra-Low-Power Wake-Up Receiver
    Reyes, Linder
    Silveira, Fernando
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66 (10) : 1703 - 1707
  • [38] Linear frequency modulation photoacoustic radar: Optimal bandwidth and signal-to-noise ratio for frequency-domain imaging of turbid media
    Lashkari, Bahman
    Mandelis, Andreas
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2011, 130 (03) : 1313 - 1324
  • [39] Blind Signal-to-Noise Ratio Estimation of Speech Based on Vector Quantizer Classifiers and Decision Level Fusion
    Russell Ondusko
    Matthew Marbach
    Ravi P. Ramachandran
    Linda M. Head
    Journal of Signal Processing Systems, 2017, 89 : 335 - 345
  • [40] Blind Signal-to-Noise Ratio Estimation of Speech Based on Vector Quantizer Classifiers and Decision Level Fusion
    Ondusko, Russell
    Marbach, Matthew
    Ramachandran, Ravi P.
    Head, Linda M.
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2017, 89 (02) : 335 - 345