Increasing Speech Intelligibility via Spectral Shaping with Frequency Warping and Dynamic Range Compression plus Transient Enhancement

Cited by: 0
Authors
Godoy, Elizabeth [1]
Stylianou, Yannis [1]
Affiliations
[1] Fdn Res & Technol Hellas, Inst Comp Sci, Iraklion, Greece
Keywords
speech intelligibility; spectral shaping; frequency warping; dynamic range compression; hard-of-hearing; clear; perception
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To make speech (natural or synthetic) more intelligible for listeners in real-world noisy environments, various modifications have been proposed that exploit spectral and temporal signal features. In a previous evaluation campaign involving several approaches, a method combining Spectral Shaping (SS) and Dynamic Range Compression (DRC) proved highly successful at increasing speech intelligibility. For the public follow-up campaign (the Hurricane Challenge), this work introduces additional modifications into SSDRC in an attempt to further enhance intelligibility. First, to slow the articulation rate, the speech is uniformly time-stretched, which effectively increases signal redundancy. Second, a frequency warping mechanism that expands the vowel space is incorporated into the SS. Third, scaling that enhances the transient regions of speech is applied in the time domain along with DRC. Objective evaluations and extensive subjective evaluations (the Hurricane Challenge) show that the new approach achieves intelligibility gains over natural speech in all of the noise conditions evaluated, although, compared to SSDRC, less advantage is observed at higher SNR.
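As a rough illustration of the DRC idea named in the abstract, the sketch below applies a generic static compressor to a signal with a quiet and a loud segment. It is not the authors' SSDRC implementation: the function name `compress_dynamic_range`, the moving-RMS envelope, and the `threshold_db`, `ratio`, and `win` parameters are all assumptions chosen for the example.

```python
import numpy as np

def compress_dynamic_range(x, threshold_db=-20.0, ratio=4.0, win=256):
    """Toy static compressor (hypothetical helper, not the paper's DRC)."""
    x = np.asarray(x, dtype=float)
    # Local envelope: moving average of the squared signal over `win` samples.
    power = np.convolve(x ** 2, np.ones(win) / win, mode="same")
    env_db = 10.0 * np.log10(power + 1e-12)
    # Above the threshold, each input dB yields only 1/ratio output dB.
    over_db = np.maximum(env_db - threshold_db, 0.0)
    gain_db = -over_db * (1.0 - 1.0 / ratio)
    y = x * 10.0 ** (gain_db / 20.0)
    # Rescale so the overall RMS matches the input (level is unchanged).
    rms_in = np.sqrt(np.mean(x ** 2))
    rms_out = np.sqrt(np.mean(y ** 2))
    return y * (rms_in / (rms_out + 1e-12))

def rms(v):
    """Root-mean-square level of a segment."""
    return np.sqrt(np.mean(np.asarray(v, dtype=float) ** 2))

# Quiet segment followed by a loud one (40 dB apart).
n = np.arange(4000)
tone = np.sin(2 * np.pi * 0.05 * n)
x = np.concatenate([0.01 * tone, 1.0 * tone])
y = compress_dynamic_range(x)

ratio_before = rms(x[4500:7500]) / rms(x[500:3500])
ratio_after = rms(y[4500:7500]) / rms(y[500:3500])
print(f"loud/quiet level ratio: {ratio_before:.1f} -> {ratio_after:.1f}")
```

Because the output is renormalised to the input RMS, the compressor redistributes energy from loud regions to quiet ones rather than simply attenuating the signal, which is the general motivation for using DRC at a fixed output level.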
Pages: 3539-3543
Page count: 5