Speech Enhancement of Complex Convolutional Recurrent Network with Attention

Authors
Jiangjiao Zeng
Lidong Yang
Affiliation
[1] School of Information Engineering, Inner Mongolia University of Science and Technology
Keywords
Speech enhancement; Parameter-free attention module; Convolutional recurrent network; Bidirectional gated recurrent unit
DOI
Not available
Abstract
Speech enhancement aims to separate clean speech from noisy speech in order to improve speech quality and intelligibility. A complex convolutional recurrent network with a parameter-free attention module is proposed to improve speech enhancement performance. First, feature information is enhanced by improving the convolutional layers of the encoder and the decoder. Then, a parameter-free attention module is added to suppress redundant information and extract features that are more effective for the speech enhancement task, and a bidirectional gated recurrent unit is used as the middle layer. On the Voice Bank + DEMAND dataset, compared with the best of several baseline models, the Perceptual Evaluation of Speech Quality (PESQ) score increased by 0.17 (6.23%), CBAK (MOS predictor of intrusiveness of background noise) increased by 0.14 (4.34%), COVL (MOS predictor of overall processed speech quality) increased by 0.40 (12.42%), and CSIG (MOS predictor of speech distortion) increased by 0.57 (15.28%). Experimental results show that the proposed approach has both theoretical significance and practical value for real-world speech enhancement.
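The abstract reports each improvement as both an absolute gain and a relative percentage, so the two figures together imply an approximate best-baseline score and an approximate score for the proposed model. The sketch below back-calculates those values. The resulting numbers are not stated in the abstract; they are estimates derived from rounded figures, and the function names are illustrative only. The best baseline may also be a different model for each metric.

```python
# Reported gains from the abstract: metric -> (absolute gain, relative gain in percent).
REPORTED_GAINS = {
    "PESQ": (0.17, 6.23),
    "CBAK": (0.14, 4.34),
    "COVL": (0.40, 12.42),
    "CSIG": (0.57, 15.28),
}


def implied_scores(abs_gain, rel_gain_percent):
    """Return (baseline, proposed) implied by an absolute and a relative gain.

    Assumes rel_gain_percent = 100 * abs_gain / baseline, the usual
    definition of relative improvement over a baseline score.
    """
    baseline = abs_gain / (rel_gain_percent / 100.0)
    return baseline, baseline + abs_gain


def summarize(gains=REPORTED_GAINS):
    """Map each metric to its implied (baseline, proposed) pair, rounded to 2 decimals."""
    return {
        name: tuple(round(value, 2) for value in implied_scores(*gain))
        for name, gain in gains.items()
    }


if __name__ == "__main__":
    for metric, (baseline, proposed) in summarize().items():
        print(f"{metric}: implied baseline ~ {baseline:.2f}, implied proposed ~ {proposed:.2f}")
```

Because the reported gains and percentages are themselves rounded, the implied scores should be read as accurate to roughly two decimal places at best.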
Pages: 1834-1847
Page count: 13