A Noise-Aware Memory-Attention Network Architecture for Regression-Based Speech Enhancement

Cited by: 0
Authors
Wang, Yu-Xuan [1 ]
Du, Jun [1 ]
Chai, Li [1 ]
Lee, Chin-Hui [2 ]
Pan, Jia [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Source
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
attention mechanism; memory block; noise-aware training; LSTM-RNN; speech enhancement;
DOI
10.21437/Interspeech.2020-2037
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
We propose a novel noise-aware memory-attention network (NAMAN) for regression-based speech enhancement, aiming to improve the quality of enhanced speech in unseen noise conditions. The NAMAN architecture consists of three parts: a main regression network, a memory block, and an attention block. First, a long short-term memory recurrent neural network (LSTM-RNN) is adopted as the main network to model the acoustic context of neighboring frames. Next, the memory block is built with an extensive set of noise feature vectors as the prior noise bases. Finally, the attention block serves as an auxiliary network that improves the noise awareness of the main network: it encodes the dynamic noise information at the frame level through additional features obtained by weighting the existing noise basis vectors in the memory block. Our experiments show that the proposed NAMAN framework is compact and outperforms the state-of-the-art dynamic noise-aware training approaches in low SNR conditions.
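The attention step described in the abstract (weighting stored noise basis vectors to form an extra per-frame feature) can be illustrated with a minimal sketch. This is not the authors' implementation: the scaled dot-product scoring, the concatenation with the input frame, the toy dimensions, and the names `noise_aware_input`, `memory`, and `frame` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def noise_aware_input(frame, memory):
    """Augment one noisy frame with an attention-weighted noise code.

    frame  : list of floats, features of the current noisy frame
    memory : list of noise basis vectors, each the same length as frame
    Assumption: scaled dot-product scoring; the paper may use another score.
    """
    dim = len(frame)
    # One score per noise basis: how well it matches the current frame.
    scores = [dot(basis, frame) / math.sqrt(dim) for basis in memory]
    weights = softmax(scores)
    # Weighted sum of the noise bases gives the frame-level noise code.
    noise_code = [
        sum(w * basis[d] for w, basis in zip(weights, memory))
        for d in range(dim)
    ]
    # Assumption: the code is concatenated to the frame before the main network.
    return frame + noise_code, weights


random.seed(0)
DIM, NUM_BASES = 8, 4  # toy sizes, chosen only for this sketch
memory = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_BASES)]
frame = [random.gauss(0.0, 1.0) for _ in range(DIM)]

augmented, weights = noise_aware_input(frame, memory)
print(len(augmented), round(sum(weights), 6))
```

In a full system the augmented vector would be fed, frame by frame, to the main LSTM-RNN regression network; that part is omitted here.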
Pages: 4501 - 4505
Page count: 5
Related papers
50 in total
  • [1] NAAGN: Noise-aware Attention-gated Network for Speech Enhancement
    Deng, Feng
    Jiang, Tao
    Wang, Xiao-Rui
    Zhang, Chen
    Li, Yan
    [J]. INTERSPEECH 2020, 2020, : 2457 - 2461
  • [2] NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement
    Wang, Wen
    Yang, Dongchao
    Ye, Qichen
    Cao, Bowen
    Zou, Yuexian
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2416 - 2423
  • [3] Regression-Based Speech Enhancement by Convolutional Neural Network
    Erseven, Mustafa
    Bolat, Bulent
    [J]. 2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [4] VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT WITH A NOISE-AWARE ENCODER
    Fang, Huajian
    Carbajal, Guillaume
    Wermter, Stefan
    Gerkmann, Timo
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 676 - 680
  • [6] An Improved Fully Convolutional Network Based on Post-Processing with Global Variance Equalization and Noise-Aware Training for Speech Enhancement
    Li, Wenlong
    Hirota, Kaoru
    Dai, Yaping
    Jia, Zhiyang
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2021, 25 (01) : 130 - 137
  • [7] Regression-Based Noise Modeling for Speech Signal Processing
    de Abreu, Caio Cesar Enside
    Duarte, Marco Aparecido Queiroz
    de Oliveira, Bruno Rodrigues
    Vieira Filho, Jozue
    Villarreal, Francisco
    [J]. FLUCTUATION AND NOISE LETTERS, 2021, 20 (03):
  • [8] AN ANALYSIS OF NOISE-AWARE FEATURES IN COMBINATION WITH THE SIZE AND DIVERSITY OF TRAINING DATA FOR DNN-BASED SPEECH ENHANCEMENT
    Rehr, Robert
    Gerkmann, Timo
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 601 - 605
  • [9] Twofold dynamic attention guided deep network and noise-aware mechanism for image denoising
    Chen, Zihao
    Raj, Alex Noel Joseph
    Rajangam, Vijayarajan
    Li, Wei
    Mahesh, Vijayalakshmi G. V.
    Zhuang, Zhemin
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (03) : 87 - 102
  • [10] Noise-aware infrared polarization image fusion based on salient prior with attention-guided filtering network
    Li, Kunyuan
    Qi, Meibin
    Zhuang, Shuo
    Liu, Yimin
    Gao, Jun
    [J]. OPTICS EXPRESS, 2023, 31 (16) : 25781 - 25796