DENSELY CONNECTED NETWORK WITH TIME-FREQUENCY DILATED CONVOLUTION FOR SPEECH ENHANCEMENT

Authors
Li, Yaxing [1]
Li, Xiaoqi [1]
Dong, Yuanjie [1]
Li, Meng [1]
Xu, Shan [1]
Xiong, Shengwu [1]
Affiliation
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan, Hubei, Peoples R China
Keywords
Dense connectivity; dilated convolution; speech enhancement
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Data-driven speech enhancement approaches based on regression deep neural networks usually require an enormous number of model parameters, which increases the computational load and makes the model harder to train. To improve model efficiency, we propose a densely connected network with time-frequency (T-F) dilated convolution for speech enhancement. The T-F dilated convolution block is designed to enlarge the receptive field and capture contextual information in both the time and frequency domains. For computational efficiency, 1-D convolutions with a bottleneck structure are used in the T-F convolution block. The T-F convolution blocks are then densely connected to ensure maximum information flow between layers and to alleviate the vanishing gradient problem. Experimental results show that the proposed scheme not only improves computational efficiency significantly but also achieves satisfactory enhancement performance compared with competing methods.
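The abstract names three ingredients: dilated 1-D convolution, a bottleneck structure, and dense connectivity. The sketch below illustrates the arithmetic behind each in plain Python. It is not the authors' implementation: the kernel size, dilation rates, channel counts, and growth rate are illustrative assumptions, since the abstract does not give any of them, and all function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated 1-D convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[t + k * dilation] for k in range(len(kernel)))
        for t in range(len(signal) - span)
    ]


def conv1d_params(kernel_size, in_channels, out_channels):
    """Weight count of a 1-D convolution, ignoring biases."""
    return kernel_size * in_channels * out_channels


def bottleneck_params(kernel_size, in_channels, mid_channels, out_channels):
    """1x1 convolution down to mid_channels, then a k-tap convolution."""
    return (conv1d_params(1, in_channels, mid_channels)
            + conv1d_params(kernel_size, mid_channels, out_channels))


def dense_input_channels(in_channels, growth_rate, num_blocks):
    """Input width of each block when all earlier outputs are concatenated."""
    return [in_channels + i * growth_rate for i in range(num_blocks)]


if __name__ == "__main__":
    # Assumed values for illustration only (not taken from the paper).
    print(receptive_field(3, [1, 1, 1, 1]))   # four plain layers
    print(receptive_field(3, [1, 2, 4, 8]))   # four dilated layers
    print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1], 2))
    print(conv1d_params(3, 256, 64), bottleneck_params(3, 256, 32, 64))
    print(dense_input_channels(64, 16, 4))
```

With these assumed settings, four dilated layers cover far more context than four plain layers at the same weight count. The bottleneck reduces the weights of a layer whose input is wide, which matters under dense connectivity because each block's input width grows with depth.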
Pages: 6860-6864
Number of pages: 5