An End-to-End Speech Enhancement Method Combining Attention Mechanism to Improve GAN

Cited by: 0
Authors
Chen, Wei [1]
Cai, Yichao [1]
Yang, Qingyu [1]
Wang, Ge [1]
Liu, Taian [1]
Liu, Xinying [1]
Affiliation
[1] Shandong University of Science and Technology, College of Intelligent Equipment, Tai'an, People's Republic of China
Keywords
Generative Adversarial Networks; time series; attention mechanisms; SEGAN; PESQ; STOI; NOISE; SUPPRESSION; NETWORKS
DOI
10.1109/IAEAC54830.2022.9929534
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Current generative adversarial networks (GANs) rely only on convolution operations when handling speech tasks. They ignore dependencies across the time series and have limited learning ability, so obvious residual noise remains in the enhanced speech. To solve this problem, an end-to-end speech enhancement method that improves the GAN with attention mechanisms is proposed. A combined attention mechanism that fuses channel and spatial attention is applied between the convolutional layers of SEGAN to obtain more contextual information about the speech in both the channel and spatial dimensions and to extract more accurate feature information. Experimental results demonstrate that the method outperforms the baseline model in both speech quality and intelligibility: across different signal-to-noise ratios, the perceptual evaluation of speech quality (PESQ) score improves by an average of 25.72%, and the short-time objective intelligibility (STOI) score improves by an average of 1.68%.
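The abstract does not say how the channel and spatial attention are fused, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes a CBAM-style ordering (channel gate first, then a gate along the time axis, which plays the role of the "spatial" dimension for a 1-D speech feature map). All function names, weight shapes, and values are hypothetical, and plain Python lists stand in for the tensors a deep learning framework would use.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def channel_attention(feat, w1, w2):
    """Scale each channel of a C x T feature map by a gate in (0, 1).

    w1 (H x C) and w2 (C x H) are the weights of a shared two-layer MLP
    (hypothetical; a trained model would learn them).
    """
    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    avg = [sum(ch) / len(ch) for ch in feat]   # average pooling over time
    peak = [max(ch) for ch in feat]            # max pooling over time
    gate = [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(peak))]
    return [[g * x for x in ch] for g, ch in zip(gate, feat)]


def temporal_attention(feat, kernel):
    """Scale each time step of a C x T feature map by a gate in (0, 1).

    kernel is 2 x K (K odd): a 1-D convolution over the channel-wise
    average and max maps, with zero padding to keep the length T.
    """
    n_ch, n_t = len(feat), len(feat[0])
    avg = [sum(feat[c][t] for c in range(n_ch)) / n_ch for t in range(n_t)]
    peak = [max(feat[c][t] for c in range(n_ch)) for t in range(n_t)]
    width = len(kernel[0])
    half = width // 2
    gate = []
    for t in range(n_t):
        acc = 0.0
        for k in range(width):
            idx = t + k - half
            if 0 <= idx < n_t:
                acc += kernel[0][k] * avg[idx] + kernel[1][k] * peak[idx]
        gate.append(sigmoid(acc))
    return [[x * g for x, g in zip(ch, gate)] for ch in feat]


def combined_attention(feat, w1, w2, kernel):
    # Assumed ordering: channel gate first, then the time-axis gate.
    return temporal_attention(channel_attention(feat, w1, w2), kernel)


# Toy usage with random, untrained weights (illustrative values only).
random.seed(0)
C, T, H, K = 4, 8, 2, 3
features = [[random.uniform(-1, 1) for _ in range(T)] for _ in range(C)]
mlp_in = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(H)]
mlp_out = [[random.uniform(-1, 1) for _ in range(H)] for _ in range(C)]
conv = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(2)]
refined = combined_attention(features, mlp_in, mlp_out, conv)
```

In this sketch the output has the same C x T shape as the input, which is what would allow such a block to sit between two convolutional layers without changing the surrounding layer sizes.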
Pages: 538-542
Number of pages: 5