Speech-in-noise intelligibility improvement based on spectral shaping and dynamic range compression

Cited by: 0
Authors
Zorila, Tudor-Catalin
Kandia, Varvara
Stylianou, Yannis
Keywords
speech-in-noise enhancement; speech intelligibility; spectral shaping; dynamic range compression; CLEAR SPEECH; LISTENERS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we suggest a non-parametric way to improve the intelligibility of speech in noise. The signal is enhanced before being presented in a noisy environment, under the constraint of equal global signal power before and after modification. Two systems are combined in cascade to enhance the quality of the signal first in frequency (spectral shaping) and then in time (dynamic range compression). Experiments with speech-shaped noise (SSN) and competing-speaker (CS) noise at various low SNR values show that the suggested approach outperforms state-of-the-art methods in terms of the Speech Intelligibility Index (SII). In terms of SNR gain, there is an improvement of 7 dB (SSN) and 8 dB (CS) over these methods. A formal listening test confirms the efficiency of the suggested system in enhancing speech intelligibility in noise.
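The cascade described in the abstract (shape the spectrum, then compress the dynamic range, while keeping the global signal power unchanged) can be illustrated with a minimal sketch. This is not the authors' system: the first-order pre-emphasis filter, the envelope-based power-law compressor, and all names and parameter values (`spectral_shaping`, `dynamic_range_compression`, `enhance`, `alpha`, `win`, `exponent`) are hypothetical stand-ins chosen only to show the structure of such a pipeline.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def spectral_shaping(x, alpha=0.95):
    """Assumed stand-in for spectral shaping: first-order pre-emphasis,
    y[n] = x[n] - alpha * x[n-1], which boosts high frequencies."""
    y = np.copy(x)
    y[1:] = x[1:] - alpha * x[:-1]
    return y


def dynamic_range_compression(x, win=160, exponent=0.5, eps=1e-8):
    """Assumed stand-in for dynamic range compression: estimate a local
    RMS envelope with a moving average, then apply a gain of
    envelope ** (exponent - 1) so that quiet parts are raised relative
    to loud parts (exponent < 1 compresses)."""
    kernel = np.ones(win) / win
    local_power = np.convolve(np.square(x), kernel, mode="same")
    envelope = np.sqrt(np.maximum(local_power, 0.0)) + eps
    gain = envelope ** (exponent - 1.0)
    return x * gain


def enhance(x):
    """Cascade: shaping in frequency, then compression in time, then
    rescaling so the output has the same global RMS as the input."""
    y = dynamic_range_compression(spectral_shaping(x))
    out_level = rms(y)
    if out_level == 0.0:
        return y  # silent input: nothing to normalize
    return y * (rms(x) / out_level)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy signal: a quiet segment followed by a loud segment.
    quiet = 0.01 * rng.standard_normal(4000)
    loud = rng.standard_normal(4000)
    x = np.concatenate([quiet, loud])
    y = enhance(x)
    print("input RMS :", rms(x))
    print("output RMS:", rms(y))
```

The final rescaling step is what enforces the equal-power constraint: any intelligibility benefit in such a scheme has to come from redistributing energy across frequency and time, not from making the signal louder overall.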
Pages: 634 - 637
Number of pages: 4