Intelligibility Enhancement of Casual Speech for Reverberant Environments inspired by Clear Speech Properties

Cited by: 0
Authors
Koutsogiannaki, Maria [1 ]
Petkov, Petko N. [2 ]
Stylianou, Yannis [1 ,2 ]
Affiliations
[1] Univ Crete, CSD, Multimedia Informat Lab, Iraklion, Greece
[2] Toshiba Res Europe Ltd, Cambridge Res Lab, Cambridge, England
Keywords
Clear Speech; Casual Speech; Intelligibility; Reverberation; Spectral Transformations; Time Modifications; Pause insertion; HARD-OF-HEARING; CONVERSATIONAL SPEECH; VOWEL INTELLIGIBILITY; SPEAKING-RATE; PERCEPTION; TALKER; NOISE;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline codes
070206; 082403;
Abstract
Clear speech has been shown to have an intelligibility advantage over casual speech in noisy and reverberant environments. This work validates spectral and time-domain modifications that increase the intelligibility of casual speech in reverberant environments by compensating for particular differences between the two speaking styles. To compensate for spectral differences, a frequency-domain filtering approach is applied to casual speech. In the time domain, two techniques for time-scaling casual speech are explored: (1) uniform time-scaling and (2) pause insertion and phoneme elongation based on loudness and modulation criteria. The effect of the proposed modifications is evaluated through subjective listening tests in two reverberant conditions with reverberation times of 0.8 s and 2 s. The combination of spectral transformation and uniform time-scaling is shown to be the most successful in increasing the intelligibility of casual speech. The evaluation results support the conclusion that modifications inspired by clear speech can be beneficial for the intelligibility enhancement of speech in reverberant environments.
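The abstract does not give implementation details, so the following is only a minimal Python sketch, not the authors' method, of the two families of modifications it names: a fixed frequency-domain emphasis filter (a crude stand-in for the clear-speech-inspired spectral transformation) and uniform time-scaling by plain overlap-add. The band edges, gain, frame size, and stretch factor are illustrative assumptions; the paper's best-performing condition chains a spectral transformation with uniform time-scaling, which is what the hypothetical usage line at the end mirrors.

```python
# Illustrative sketch only (not the paper's algorithm): spectral emphasis
# plus uniform time-scaling of a casual-speech signal x sampled at fs Hz.
import numpy as np
from scipy import signal


def spectral_emphasis(x, fs, low_hz=1000.0, high_hz=4000.0, gain_db=6.0):
    """Boost a mid-frequency band (assumed proxy for the frequency-domain
    filtering of casual speech described in the abstract)."""
    sos = signal.butter(4, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    band = signal.sosfilt(sos, x)
    gain = 10.0 ** (gain_db / 20.0)
    y = x + (gain - 1.0) * band
    return y / (np.max(np.abs(y)) + 1e-12)  # normalise to avoid clipping


def ola_time_stretch(x, stretch=1.3, frame=1024, synth_hop=256):
    """Crude uniform time-scaling by overlap-add; stretch > 1 slows speech.
    Real systems use WSOLA or a phase vocoder, but the principle is the same:
    read frames at one hop, write them at a larger hop."""
    x = np.pad(np.asarray(x, dtype=float), (0, frame))
    ana_hop = max(1, int(round(synth_hop / stretch)))
    win = np.hanning(frame)
    n_frames = (len(x) - frame) // ana_hop + 1
    out = np.zeros(n_frames * synth_hop + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        a, s = i * ana_hop, i * synth_hop
        out[s:s + frame] += win * x[a:a + frame]
        norm[s:s + frame] += win
    return out / np.maximum(norm, 1e-6)


# Hypothetical usage, chaining the two modifications:
#   y = ola_time_stretch(spectral_emphasis(x, fs), stretch=1.3)
```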
Pages: 65-69
Page count: 5
Related papers (50 in total)
  • [31] Lam, Jennifer; Tjaden, Kris. Intelligibility of Clear Speech: Effect of Instruction. JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2013, 56(5): 1429-1440.
  • [32] Roman, Nicoleta; Woodruff, John. Intelligibility of reverberant noisy speech with ideal binary masking. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2011, 130(4): 2153-2161.
  • [33] Huang, Dong-Yan; Rahardja, Susanto; Ong, Ee Pina. Biologically Inspired Algorithm for Enhancement of Speech Intelligibility Over Telephone Channel. 2009 IEEE INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP 2009), 2009: 340-345.
  • [34] Liu, Yang; Nower, Naushin; Morita, Shota; Unoki, Masashi. Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments. SPEECH COMMUNICATION, 2016, 84: 1-14.
  • [35] Colotte, V; Laprie, Y. Automatic enhancement of speech intelligibility. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000: 1057-1060.
  • [36] Kambayashi, Chihiro; Hodoshima, Nao. Improving intelligibility of speech spoken under reverberant environment conditions: Effect of reverberation frequency characteristics on speech intelligibility. ACOUSTICAL SCIENCE AND TECHNOLOGY, 2020, 41(1): 418-419.
  • [37] Parija, Smita; Sahu, Prasanna Kumar; Singh, Sudhansu Sekhar. Speech enhancement by speech intelligibility index in sensor network. 2012 THIRD INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION & NETWORKING TECHNOLOGIES (ICCCNT), 2012.
  • [38] Grosse, Julian; van de Par, Steven. A Speech Preprocessing Method Based on Overlap-Masking Reduction to Increase Intelligibility in Reverberant Environments. JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2017, 65(1-2): 31-41.
  • [39] Matsukaze, Yohei; Arai, Takayuki; Suzuki, Toshimasa; Yasu, Keiichi. Evaluation of speech naturalness with steady-state zero padding for improving intelligibility in reverberant environments. ACOUSTICAL SCIENCE AND TECHNOLOGY, 2012, 33(6): 370-371.
  • [40] Arai, Takayuki; Hodoshima, Nao; Yasu, Keiichi. Using Steady-State Suppression to Improve Speech Intelligibility in Reverberant Environments for Elderly Listeners. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18(7): 1775-1780.