DCE-CDPPTnet: Dense Connected Encoder Cross Dual-path Parrel Transformer Network for Multi-channel Speech Separation

Cited by: 0

Authors:

Zhuang, Chenghao [1]

Zhou, Lin [1]

Cao, Yanxiang [1]

Wang, Qirui [1]

Cheng, Yunling [1]

Affiliations:

[1] Southeast University, School of Information Science and Engineering, Nanjing, People's Republic of China

Source:

2024 13th International Conference on Communications, Circuits and Systems (ICCCAS 2024) | 2024

Keywords:

speech separation; multi-channel; transformer

DOI:

10.1109/ICCCAS62034.2024.10652697

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In recent years, end-to-end time-domain models for speech separation have attracted increasing attention and have demonstrated superior performance. Research on time-domain speech separation has focused on two main areas: extracting more effective features and improving the modeling of temporal speech sequences. To address these challenges, we propose the dense connected encoder cross dual-path parallel transformer network (DCE-CDPPTnet) for multi-channel speech separation. The dense connected encoder enhances feature extraction through multiple convolutional layers organized in a densely connected structure, and it also mitigates the vanishing-gradient problem. To better model long speech sequences, the proposed model incorporates the cross dual-path parallel transformer, which uses the intra improved transformer and the inter improved transformer to capture local and global information, respectively. Moreover, by running the intra improved transformer and the inter improved transformer in parallel, the CDPPTnet allows local and global information to interact. Simulation results under various model configurations demonstrate the superior performance of the proposed DCE-CDPPTnet compared with the filter-and-sum network with transform-average-concatenate module (FaSNet-TAC).
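The distinction the abstract draws between intra (local) and inter (global) modeling can be illustrated with a minimal Python sketch of dual-path chunking. This is not the authors' implementation: the helper names `segment`, `intra_view`, and `inter_view`, the chunk length, and the use of non-overlapping zero-padded chunks are all assumptions made for illustration, and the parallel interaction between the two paths is not shown.

```python
# Illustrative sketch only: chunk length and helper names are assumptions,
# not taken from the DCE-CDPPTnet paper.

def segment(frames, chunk_len):
    """Split a frame sequence into non-overlapping chunks, zero-padding the tail."""
    padded = list(frames)
    remainder = len(padded) % chunk_len
    if remainder:
        padded += [0] * (chunk_len - remainder)
    return [padded[i:i + chunk_len] for i in range(0, len(padded), chunk_len)]


def intra_view(chunks):
    """Sequences an intra-chunk model would see: one per chunk (local context)."""
    return [list(chunk) for chunk in chunks]


def inter_view(chunks):
    """Sequences an inter-chunk model would see: one per in-chunk position (global context)."""
    return [list(column) for column in zip(*chunks)]


frames = list(range(1, 11))      # ten stand-in encoder frames
chunks = segment(frames, 4)      # three chunks of four frames, the last one padded
local_seqs = intra_view(chunks)  # each sequence covers neighbouring frames
global_seqs = inter_view(chunks) # each sequence strides across the whole utterance
```

Under these assumptions, each intra sequence holds adjacent frames, while each inter sequence picks the same position from every chunk and therefore spans the full input.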

Pages: 303-308

Page count: 6

