DE-DPCTnet: Deep Encoder Dual-path Convolutional Transformer Network for Multi-channel Speech Separation

Cited by: 0
Authors
Wang, Zhenyu [1 ,2 ,4 ]
Zhou, Yi [1 ,2 ]
Gan, Lu [3 ,4 ]
Chen, Rilin
Tang, Xinyu [1 ,2 ]
Liu, Hongqing [1 ,2 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
[2] Chongqing Key Lab Signal & Informat Proc, Chongqing 400065, Peoples R China
[3] Brunel Univ, Coll Engn Design & Phys Sci, London UB8 3PH, England
[4] Tencent AI Lab, Beijing, Peoples R China
Keywords
Speech separation; multi-channel; deep encoder; improved transformer; beamforming; TASNET
DOI
10.1109/SIPS55645.2022.9919247
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, beamforming has been extensively investigated for multi-channel speech separation. In this paper, we propose a deep encoder dual-path convolutional transformer network (DE-DPCTnet), which directly estimates the beamforming filters for speech separation in the time domain. To learn the signal repetitions correctly, a nonlinear deep encoder module is proposed to replace the traditional linear one. An improved transformer is also developed, which uses convolutions to capture long-time speech sequences. Ablation studies demonstrate that the deep encoder and the improved transformer both benefit separation performance. Comparisons show that DE-DPCTnet outperforms the state-of-the-art filter-and-sum network with transform-average-concatenate module (FaSNet-TAC), even with lower computational complexity.
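For readers unfamiliar with the filter-and-sum operation that the abstract refers to, the following is a minimal sketch of time-domain filter-and-sum beamforming: each microphone channel is convolved with its own FIR filter and the results are summed. It is not the DE-DPCTnet model. In the paper the filters are estimated by the network, whereas here they are hand-picked for a toy two-microphone case, and the names `convolve` and `filter_and_sum` are illustrative assumptions.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Full linear convolution of one channel with its FIR filter taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += s * h
    return out


def filter_and_sum(
    channels: Sequence[Sequence[float]],
    filters: Sequence[Sequence[float]],
) -> List[float]:
    """Filter each channel with its own taps, then sum across channels."""
    if len(channels) != len(filters):
        raise ValueError("need exactly one filter per channel")
    filtered = [convolve(x, h) for x, h in zip(channels, filters)]
    length = min(len(y) for y in filtered)
    return [sum(y[n] for y in filtered) for n in range(length)]


# Toy example (assumed data): the source [1, 2, 3] reaches microphone 1
# one sample later than microphone 0.
mic0 = [1.0, 2.0, 3.0, 0.0]
mic1 = [0.0, 1.0, 2.0, 3.0]

# Hand-picked filters: delay mic0 by one sample so both channels align,
# and weight each channel by 0.5 so the sum has unit gain.
taps0 = [0.0, 0.5]
taps1 = [0.5, 0.0]

separated = filter_and_sum([mic0, mic1], [taps0, taps1])
print(separated)  # [0.0, 1.0, 2.0, 3.0, 0.0]
```

In this toy case the aligned channels add coherently, so the output is the source delayed by one sample. A learned system would replace the fixed taps with filters predicted per frame from the multi-channel input.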
Pages: 180-184
Page count: 5
Related papers
50 in total
  • [1] DCE-CDPPTnet: Dense Connected Encoder Cross Dual-path Parrel Transformer Network for Multi-channel Speech Separation
    Zhuang, Chenghao
    Zhou, Lin
    Cao, Yanxiang
    Wang, Qirui
    Cheng, Yunling
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024, : 303 - 308
  • [2] Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments
    Wang, Chunxi
    Jia, Maoshen
    Zhang, Xinfeng
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)
  • [3] Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments
    Chunxi Wang
    Maoshen Jia
    Xinfeng Zhang
    EURASIP Journal on Audio, Speech, and Music Processing, 2023
  • [4] Fault diagnosis of the harmonic reducer based on dual-path convolutional network with multi-channel hybrid attention mechanism
    Li, Kai
    Yang, Ronggang
    Wei, Tianci
    Yang, Yiwen
    Xiang, Jiawei
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (01)
  • [5] Dual-Path Hybrid Attention Network for Monaural Speech Separation
    Qiu, Wenbo
    Hu, Ying
    IEEE ACCESS, 2022, 10 : 78754 - 78763
  • [6] DEEP COMPLEX CONVOLUTIONAL RECURRENT NETWORK FOR MULTI-CHANNEL SPEECH ENHANCEMENT AND DEREVERBERATION
    Gelderblom, Femke B.
    Myrvoll, Tor Andre
    2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021,
  • [7] Single Channel Speech Enhancement using a Complex Dual-Path Multi Axial Transformer with Frequency Prompt
    Jannu, Chaitanya
    Burra, Manaswini
    Vanambathina, Sunny Dayal
    Parisae, Veeraswamy
    Krishna, Chinta Venkata Murali
    Madhumati, G. L.
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2025,
  • [8] SPLIT-ATTENTION MECHANISMS WITH GRAPH CONVOLUTIONAL NETWORK FOR MULTI-CHANNEL SPEECH SEPARATION
    Tan, YingWei
    Ding, XueFeng
    2024 18TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT, IWAENC 2024, 2024, : 140 - 144
  • [9] Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation
    Yang, Xue
    Bao, Changchun
    INTERSPEECH 2022, 2022, : 5338 - 5342
  • [10] PhaseDCN: A Phase-Enhanced Dual-Path Dilated Convolutional Network for Single-Channel Speech Enhancement
    Zhang, Lu
    Wang, Mingjiang
    Zhang, Qiquan
    Wang, Xinsheng
    Liu, Ming
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2561 - 2574