Deep Residual-Dense Lattice Network for Speech Enhancement

Cited by: 0
Authors
Nikzad, Mohammad [1]
Nicolson, Aaron [1]
Gao, Yongsheng [1]
Zhou, Jun [1]
Paliwal, Kuldip K. [1]
Shang, Fanhua [2]
Affiliations
[1] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia
[2] Xidian Univ, Sch Artificial Intelligence, Xian, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks (CNNs) with residual links (ResNets) and causal dilated convolutional units have been the network of choice for deep learning approaches to speech enhancement. While residual links improve gradient flow during training, feature diminution of shallow layer outputs can occur due to repetitive summations with deeper layer outputs. One strategy to improve feature re-usage is to fuse both ResNets and densely connected CNNs (DenseNets). DenseNets, however, over-allocate parameters for feature re-usage. Motivated by this, we propose the residual-dense lattice network (RDL-Net), which is a new CNN for speech enhancement that employs both residual and dense aggregations without over-allocating parameters for feature re-usage. This is managed through the topology of the RDL blocks, which limit the number of outputs used for dense aggregations. Our extensive experimental investigation shows that RDL-Nets are able to achieve a higher speech enhancement performance than CNNs that employ residual and/or dense aggregations. RDL-Nets also use substantially fewer parameters and have a lower computational requirement. Furthermore, we demonstrate that RDL-Nets outperform many state-of-the-art deep learning approaches to speech enhancement.
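The abstract contrasts two ways of aggregating features: residual links, which sum layer outputs and keep the feature width fixed, and dense connections, which concatenate outputs so that each later layer sees a wider input. The following is a minimal sketch of that contrast, together with a causal dilated 1-D convolution of the kind the abstract mentions. It is not the RDL block topology, which this record does not specify. All function names, the layer widths, the growth rate, and the kernel size are assumptions chosen for illustration.

```python
def causal_dilated_conv1d(x, w, dilation):
    """Single-channel causal dilated convolution (illustrative).

    y[t] = sum_k w[k] * x[t - k * dilation], where samples before t = 0
    are treated as zero, so y[t] never depends on future inputs.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if idx >= 0:
                acc += wk * x[idx]
        y.append(acc)
    return y


def conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameter count of one 1-D convolutional layer."""
    return kernel_size * c_in * c_out + (c_out if bias else 0)


def residual_stack_params(width, depth, kernel_size):
    """Residual aggregation: outputs are summed with the input,
    so every layer maps width -> width (hypothetical plain stack)."""
    return sum(conv1d_params(width, width, kernel_size) for _ in range(depth))


def dense_stack_params(width, growth, depth, kernel_size):
    """Dense aggregation: each layer's output (growth channels) is
    concatenated onto its input, widening the next layer's input."""
    total = 0
    c_in = width
    for _ in range(depth):
        total += conv1d_params(c_in, growth, kernel_size)
        c_in += growth
    return total


if __name__ == "__main__":
    # Assumed sizes: 64 channels, 4 layers, kernel size 3, growth rate 64.
    print("residual:", residual_stack_params(64, 4, 3))
    print("dense:   ", dense_stack_params(64, 64, 4, 3))
    print(causal_dilated_conv1d([1, 2, 3, 4], [1, 1], dilation=2))
```

Under these assumed sizes the dense stack needs more parameters than the residual stack of the same depth, because the input width of each dense layer grows with every concatenation. According to the abstract, RDL blocks address this by limiting the number of outputs used for dense aggregations.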
Pages: 8552-8559
Number of pages: 8
Related papers
50 in total
  • [41] Enhancement of speech using deep neural network with discrete cosine transform
    Ram, Rashmirekha
    Mohanty, Mihir Narayan
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 35 (01) : 141 - 148
  • [42] Fractional feature-based speech enhancement with deep neural network
    Xu, Liyun
    Zhang, Tong
    [J]. SPEECH COMMUNICATION, 2023, 153
  • [43] Subjective intelligibility of deep neural network-based speech enhancement
    Gelderblom, Femke B.
    Tronstad, Tron V.
    Viggen, Erlend Magnus
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1968 - 1972
  • [44] Speech enhancement combining accurate ratio masking and deep neural network
    Bai, Haojun
    Zhang, Tianqi
    Liu, Jianxing
    Ye, Shaopeng
    [J]. Shengxue Xuebao/Acta Acustica, 2022, 47 (03) : 394 - 404
  • [45] A Perceptually Motivated Approach for Speech Enhancement Based on Deep Neural Network
    Han, Wei
    Zhang, Xiongwei
    Min, Gang
    Sun, Meng
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (04) : 835 - 838
  • [46] Deep Neural Network for Supervised Single-Channel Speech Enhancement
    Saleem, Nasir
    Irfan Khattak, Muhammad
    Ali, Muhammad Yousaf
    Shafi, Muhammad
    [J]. ARCHIVES OF ACOUSTICS, 2019, 44 (01) : 3 - 12
  • [47] A two-stage frequency-time dilated dense network for speech enhancement
    Huang, Xiangdong
    Chen, Honghong
    Lu, Wei
    [J]. APPLIED ACOUSTICS, 2022, 201
  • [48] DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement
    Lee, Dongheon
    Choi, Jung-Woo
    [J]. IEEE Signal Processing Letters, 2023, 30 : 155 - 159
  • [49] Progressive Speech Enhancement with Residual Connections
    Llombart, Jorge
    Ribas, Dayana
    Miguel, Antonio
    Vicente, Luis
    Ortega, Alfonso
    Lleida, Eduardo
    [J]. INTERSPEECH 2019, 2019, : 3193 - 3197
  • [50] Deep Residual Network in Network
    Alaeddine, Hmidi
    Jihene, Malek
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021