An End-to-End Speech Enhancement Method Combining Attention Mechanism to Improve GAN

Cited by: 0
Authors
Chen, Wei [1 ]
Cai, Yichao [1 ]
Yang, Qingyu [1 ]
Wang, Ge [1 ]
Liu, Taian [1 ]
Liu, Xinying [1 ]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Intelligent Equipment, Tai An, Peoples R China
Keywords
Generative Adversarial Networks; time series; attention mechanisms; SEGAN; PESQ; STOI; NOISE; SUPPRESSION; NETWORKS;
DOI
10.1109/IAEAC54830.2022.9929534
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Current Generative Adversarial Networks rely only on convolution operations when handling speech tasks; they ignore dependencies within the time series and have limited learning ability, so obvious residual noise remains in the enhanced speech. To address this problem, an end-to-end speech enhancement method that improves GAN with an attention mechanism is proposed: a combined attention mechanism fusing channel and spatial attention is inserted between the convolutional layers of SEGAN, so that more contextual speech information is captured in both the channel and spatial dimensions and more accurate feature information is extracted. Experimental results demonstrate that the method outperforms the baseline model in both speech quality and intelligibility: across different signal-to-noise ratios, the Perceptual Evaluation of Speech Quality (PESQ) score improves by an average of 25.72% and the Short-Time Objective Intelligibility (STOI) score improves by an average of 1.68%.
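The abstract does not spell out how the channel and spatial attention are fused, so the sketch below is only one plausible reading: a CBAM-style block adapted to the 1-D feature maps of SEGAN's convolutional encoder, written in PyTorch. The class name ChannelSpatialAttention1d, the reduction ratio, the kernel size, and the average/max pooling choices are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ChannelSpatialAttention1d(nn.Module):
    # Hypothetical combined attention block: channel attention followed by
    # spatial (time-axis) attention, applied to (batch, channels, time) features.
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Shared MLP for channel attention (pool over time, excite channels).
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # 1-D convolution that scores each time step from pooled channel statistics.
        self.spatial_conv = nn.Conv1d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention: pool over the time axis, re-weight each channel.
        avg_c = x.mean(dim=2)                          # (B, C)
        max_c, _ = x.max(dim=2)                        # (B, C)
        ch = torch.sigmoid(self.channel_mlp(avg_c) + self.channel_mlp(max_c))
        x = x * ch.unsqueeze(-1)
        # Spatial attention: pool over channels, re-weight each time step.
        avg_t = x.mean(dim=1, keepdim=True)            # (B, 1, T)
        max_t, _ = x.max(dim=1, keepdim=True)          # (B, 1, T)
        sp = torch.sigmoid(self.spatial_conv(torch.cat([avg_t, max_t], dim=1)))
        return x * sp

# Illustrative usage between two SEGAN-style encoder convolutions
# (the channel count 64 is an assumption, not the paper's configuration):
# feats = encoder_conv(waveform)                 # (B, 64, T)
# feats = ChannelSpatialAttention1d(64)(feats)

In this reading, the channel branch decides which feature maps matter while the spatial branch decides which time steps matter, which is one way to inject the time-series context that plain convolution in SEGAN lacks.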
Pages: 538-542 (5 pages)
Related Papers
50 items in total
  • [21] Li, Lujun; Kang, Yikai; Shi, Yuchen; Kurzinger, Ludwig; Watzel, Tobias; Rigoll, Gerhard. Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021(01).
  • [22] Yang, Da-Hee; Chang, Joon-Hyuk. Attention-based latent features for jointly trained end-to-end automatic speech recognition with modified speech enhancement. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35(03): 202-210.
  • [23] Zhuang, Xuyi; Zhang, Lu; Zhang, Zehua; Qian, Yukun; Wang, Mingjiang. Coarse-Grained Attention Fusion with Joint Training Framework for Complex Speech Enhancement and End-to-End Speech Recognition. INTERSPEECH 2022, 2022: 3794-3798.
  • [24] Gao, Qiang; Wu, Haiwei; Sun, Yanqing; Duan, Yitao. An End-to-End Speech Accent Recognition Method Based on Hybrid CTC/Attention Transformer ASR. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7253-7257.
  • [25] Bahdanau, Dzmitry; Chorowski, Jan; Serdyuk, Dmitriy; Brakel, Philemon; Bengio, Yoshua. End-to-End Attention-Based Large Vocabulary Speech Recognition. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016: 4945-4949.
  • [26] Hou, Junfeng; Zhang, Shiliang; Dai, Lirong. Gaussian Prediction based Attention for Online End-to-End Speech Recognition. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 3692-3696.
  • [27] Miao, Haoran; Cheng, Gaofeng; Zhang, Pengyuan; Li, Ta; Yan, Yonghong. Online Hybrid CTC/Attention Architecture for End-to-end Speech Recognition. INTERSPEECH 2019, 2019: 2623-2627.
  • [28] Sun, Sining; Guo, Pengcheng; Xie, Lei; Hwang, Mei-Yuh. Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27(11): 1826-1838.
  • [29] Zhou, Pan; Yang, Wenwen; Chen, Wei; Wang, Yanfeng; Jia, Jia. Modality Attention for End-to-End Audio-Visual Speech Recognition. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6565-6569.
  • [30] Wang, Peidong; Cui, Jia; Weng, Chao; Yu, Dong. Large Margin Training for Attention Based End-to-End Speech Recognition. INTERSPEECH 2019, 2019: 246-250.