Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain

Cited by: 42
Authors
Relano-Iborra, Helia [1]
May, Tobias [1]
Zaar, Johannes [1]
Scheidiger, Christoph [1]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Dept Elect Engn, Hearing Syst Grp, DK-2800 Lyngby, Denmark
Source
Keywords
RECEPTION THRESHOLD; AMPLITUDE-MODULATION; TRANSMISSION INDEX; FREQUENCY; NOISE; MASKING; MODEL;
DOI
10.1121/1.4964505
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436-446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125-2136]. This "hybrid" model, named sEPSM(corr), is shown to account for the effects of stationary and fluctuating additive interferers as well as for the effects of non-linear distortions, such as spectral subtraction, phase jitter, and ideal time-frequency segregation (ITFS). The model shows a broader predictive range than both the original mr-sEPSM (which fails in the phase-jitter and ITFS conditions) and STOI (which fails to predict the influence of fluctuating interferers), albeit with lower accuracy than the source models in some individual conditions. Similar to other models that employ a short-term correlation-based back end, including STOI, the proposed model fails to account for the effects of room reverberation on speech intelligibility. Overall, the model might be valuable for evaluating the effects of a large range of interferers and distortions on speech intelligibility, including consequences of hearing impairment and hearing-instrument signal processing. © 2016 Author(s).
Pages: 2670-2679
Page count: 10
Related papers
50 in total
  • [31] Effects of lowpass and highpass filtering on the intelligibility of speech based on temporal fine structure or envelope cues
    Ardoint, Marine
    Lorenzi, Christian
    HEARING RESEARCH, 2010, 260 (1-2) : 89 - 95
  • [32] Speech intelligibility improvement in noisy environments based on energy correlation in frequency bands
    Goli, Peyman
    Karami-mollaei, Mohammad Reza
    DIGITAL SIGNAL PROCESSING, 2017, 62 : 238 - 248
  • [33] Efficient methods in LPA using power spectrum estimation of envelope of speech signal
    R.G.M. College of Engineering, Nandyal, Kurnool Dist, A.P., India
    Unknown
    Unknown
    Inf. Technol. J., 2007, 2 (300-303):
  • [34] Speech pause detection for noise spectrum estimation by tracking power envelope dynamics
    Marzinzik, M
    Kollmeier, B
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (02): : 109 - 118
  • [35] Estimates of speech intelligibility based on equivalent speech- and noise-spectrum levels and hearing thresholds
    3958 Sherwood Rd., Victoria, BC V8N 4E6, Canada
    Unknown
    Unknown
    Unknown
    Can Acoust, 3 (112-113):
  • [36] A NEW MASK-BASED OBJECTIVE MEASURE FOR PREDICTING THE INTELLIGIBILITY OF BINARY MASKED SPEECH
    Yu, Chengzhu
    Wojcicki, Kamil K.
    Loizou, P. C.
    Hansen, John H. L.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7030 - 7033
  • [37] Predicting the speech recognition performance of elderly individuals with sensorineural hearing impairment - A procedure based on the Speech Intelligibility Index
    Magnusson, L
    SCANDINAVIAN AUDIOLOGY, 1996, 25 (04): : 215 - 222
  • [38] Enhancement of speech intelligibility under noisy reverberant conditions based on modulation spectrum concept
    Van Ngo, Thuan
    Ho, Tuan Vu
    Unoki, Masashi
    Kubo, Rieko
    Akagi, Masato
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 753 - 758
  • [39] Improving Speech Intelligibility in Noise Using a Binary Mask That Is Based on Magnitude Spectrum Constraints
    Kim, Gibak
    Loizou, Philipos C.
    IEEE SIGNAL PROCESSING LETTERS, 2010, 17 (12) : 1010 - 1013
  • [40] A New Metric for the Envelope of OFDM Signals Based on Memory Power Amplifier
    Zhang, Xiangyin
    Zhu, Xiaodong
    Tang, Youxi
    WIRELESS PERSONAL COMMUNICATIONS, 2014, 79 (03) : 1911 - 1923