A Time-domain Monaural Speech Enhancement with Feedback Learning

Cited by: 0
Authors
Li, Andong [1 ,2 ]
Zheng, Chengshi [1 ,2 ]
Cheng, Linjuan [1 ,2 ]
Peng, Renhua [1 ,2 ]
Li, Xiaodong [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
NOISE
DOI
None available
CLC Number
TP [Automation and computer technology]
Discipline Code
0812
Abstract
In this paper, we propose FTNet, a time-domain neural network with feedback learning for monaural speech enhancement. The proposed network consists of three principal components. The first is a stage recurrent neural network, introduced to effectively aggregate deep feature dependencies across stages with a memory mechanism and to remove the interference stage by stage. The second is a convolutional auto-encoder. The third is a series of concatenated gated linear units, which facilitate information flow and gradually enlarge the receptive field. Feedback learning is adopted to improve parameter efficiency, so the number of trainable parameters is effectively reduced without sacrificing performance. Extensive experiments on the TIMIT corpus demonstrate that the proposed network achieves consistently better PESQ and STOI scores than two state-of-the-art time-domain baselines under different conditions.
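The two mechanisms the abstract leans on can be illustrated in miniature: a gated linear unit (a content path modulated element-wise by a sigmoid gate) stacked with growing dilation to widen the receptive field, and feedback learning, i.e. re-applying the same weights over several refinement stages so the parameter count does not grow with the number of stages. The sketch below is a hypothetical, heavily simplified single-channel illustration of these ideas, not the actual FTNet architecture; all function names, weights, and the stage-mixing rule are assumptions for demonstration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution over a list of samples (zero-padded)."""
    pad = (len(w) - 1) * dilation
    xp = [0.0] * pad + list(x)
    return [sum(w[j] * xp[t + pad - j * dilation] for j in range(len(w)))
            for t in range(len(x))]

def glu_block(x, w_a, w_b, dilation):
    """Gated linear unit: a content path modulated element-wise by a sigmoid gate."""
    a = dilated_conv1d(x, w_a, dilation)
    b = dilated_conv1d(x, w_b, dilation)
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]

def feedback_enhance(noisy, w_a, w_b, n_stages=3):
    """Apply the SAME weights at every stage: each stage refines the previous
    estimate, which is the parameter-reuse idea behind feedback learning."""
    est = list(noisy)
    for _ in range(n_stages):
        # Crude stand-in for the stage-RNN memory: mix current estimate with input.
        h = [0.5 * (e + n) for e, n in zip(est, noisy)]
        for d in (1, 2, 4):  # stacked GLUs with growing dilation widen the receptive field
            g = glu_block(h, w_a, w_b, d)
            h = [hi + gi for hi, gi in zip(h, g)]  # residual connection
        est = h
    return est
```

Because the loop in `feedback_enhance` reuses `w_a` and `w_b` at every stage, adding refinement stages costs no extra trainable parameters, which is the efficiency argument the abstract makes for feedback learning.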
Pages: 769-774 (6 pages)