Improving Speech Intelligibility in Monaural Segregation System by Fusing Voiced and Unvoiced Speech Segments

Cited by: 8
Authors
Shoba, S. [1]
Rajavel, R. [1]
Affiliation
[1] SSN Coll Engn, Old Mahabalipuram Rd, Chennai 603110, Tamil Nadu, India
Keywords
Speech segregation; CASA; Segments fusion; Segmentation; Speech intelligibility;
DOI
10.1007/s00034-018-1005-3
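Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code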
0808 ; 0809 ;
Abstract
Improving speech intelligibility remains a challenging problem in digital hearing aids. This work proposes a new speech segregation algorithm that improves intelligibility by fusing the voiced and unvoiced segments of the speech signal using a genetic algorithm. The voiced speech segments are obtained using perceptual speech cues such as auto-correlation, cross-channel correlation and pitch. The unvoiced speech segments are obtained using another perceptual cue, onset/offset. Because onset- and offset-based segregation produces segments for both voiced and unvoiced speech, the unvoiced segments are obtained by subtracting the voiced segments from the onset/offset segments. The unvoiced segments obtained in this way may still contain interference. This work therefore proposes a scheme that removes the interference from the unvoiced segments and fuses the voiced and unvoiced segments using the genetic algorithm. The performance of the proposed algorithm is evaluated using the intelligibility measures coherence speech intelligibility index (CSII), normalized covariance metric (NCM) and short-time objective intelligibility (STOI). The experimental results show that the proposed algorithm significantly improves speech intelligibility compared with other existing systems, by an average of 0.23 for CSII, 0.20 for NCM and 0.16 for STOI.
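The subtraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each segment is represented as a set of time-frequency units `(channel, frame)`, and the function names `unvoiced_candidates` and `fuse` as well as the `keep` predicate are hypothetical. The genetic-algorithm fusion and the interference removal are not specified in this record, so `keep` is only a placeholder for that selection step.

```python
# Illustrative sketch only; all names and data below are assumptions,
# not taken from the paper. A T-F unit is a (channel, frame) pair.

def unvoiced_candidates(onset_offset_units, voiced_units):
    """Subtract the voiced units from the onset/offset-based units."""
    return onset_offset_units - voiced_units


def fuse(voiced_units, candidate_units, keep):
    """Combine voiced units with the unvoiced candidates accepted by `keep`.

    `keep` is a placeholder predicate standing in for the paper's
    interference removal and genetic-algorithm selection.
    """
    return voiced_units | {u for u in candidate_units if keep(u)}


# Toy data: onset/offset analysis covers both voiced and unvoiced regions.
onset_offset = {(0, 0), (0, 1), (1, 1), (2, 3), (2, 4)}
voiced = {(0, 0), (0, 1), (1, 1)}

candidates = unvoiced_candidates(onset_offset, voiced)
# Hypothetical rule: treat units at frame 4 or later as interference.
fused = fuse(voiced, candidates, keep=lambda unit: unit[1] < 4)
```

With this toy data, `candidates` holds the two units not explained by the voiced segments, and `fused` keeps only the one that passes the placeholder rule.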
Pages: 3573-3590
Number of pages: 18
Source: Circuits, Systems, and Signal Processing, 2019, 38: 3573-3590