DCE-CDPPTnet: Dense Connected Encoder Cross Dual-path Parrel Transformer Network for Multi-channel Speech Separation

Authors
Zhuang, Chenghao [1]
Zhou, Lin [1]
Cao, Yanxiang [1]
Wang, Qirui [1]
Cheng, Yunling [1]
Affiliations
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing, Peoples R China
Keywords
speech separation; multi-channel; transformer
DOI
10.1109/ICCCAS62034.2024.10652697
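Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract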
In recent years, end-to-end time-domain models for speech separation have attracted increasing attention and have demonstrated superior performance. Research on time-domain speech separation has focused on two main areas: extracting more effective features and improving the modeling of temporal speech sequences. To address these challenges, we propose the dense connected encoder cross dual-path parallel transformer network (DCE-CDPPTnet) for multi-channel speech separation. The dense connected encoder enhances feature extraction through multiple convolutional layers organized in a densely connected structure, and it also mitigates the vanishing-gradient problem. To better model long speech sequences, the proposed model incorporates the cross dual-path parallel transformer, which uses an intra improved transformer and an inter improved transformer to capture local and global information, respectively. Moreover, by running the intra improved transformer and the inter improved transformer in parallel, the CDPPTnet allows local and global information to interact. Simulation results under various model configurations demonstrate the superior performance of the proposed DCE-CDPPTnet compared with the filter-and-sum network with transform-average-concatenate module (FaSNet-TAC).
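The dual-path idea in the abstract (an intra path for local context, an inter path for global context, run in parallel) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `segment`, `intra_path`, `inter_path`, and `parallel_block` are hypothetical, plain averaging stands in for the improved transformers, and the residual sum used to fuse the two paths is only one possible choice, since the abstract does not specify the fusion.

```python
from statistics import fmean


def segment(frames, chunk_len):
    """Split a sequence into 50%-overlapping chunks, zero-padding the tail.

    chunk_len is assumed to be an even integer >= 2.
    """
    hop = chunk_len // 2
    padded = list(frames)
    # Pad with zeros until the final chunk is complete.
    while len(padded) < chunk_len or (len(padded) - chunk_len) % hop != 0:
        padded.append(0.0)
    return [padded[s:s + chunk_len]
            for s in range(0, len(padded) - chunk_len + 1, hop)]


def intra_path(chunks):
    """Stand-in for the intra transformer: mix values WITHIN each chunk."""
    return [[fmean(chunk)] * len(chunk) for chunk in chunks]


def inter_path(chunks):
    """Stand-in for the inter transformer: mix values ACROSS chunks
    at the same in-chunk position."""
    column_means = [fmean(column) for column in zip(*chunks)]
    return [list(column_means) for _ in chunks]


def parallel_block(chunks):
    """Run both paths on the same input and fuse them.

    Assumption: fusion is a residual sum (input + local + global).
    """
    local = intra_path(chunks)
    global_ = inter_path(chunks)
    return [[x + l + g for x, l, g in zip(row, local_row, global_row)]
            for row, local_row, global_row in zip(chunks, local, global_)]


if __name__ == "__main__":
    chunks = segment([float(i) for i in range(1, 9)], chunk_len=4)
    for row in parallel_block(chunks):
        print(row)
```

In a serial dual-path design the inter path would consume the output of the intra path. Here both paths read the same chunked input, so each output element combines local and global context in a single block.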
Pages: 303-308
Number of pages: 6