DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement

Cited by: 32

Authors:

Lv, Shubo [1]

Hu, Yanxin [1]

Zhang, Shimin [1]

Xie, Lei [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp ASLP NPU, Xian, Peoples R China

Source:

INTERSPEECH 2021 | 2021

Keywords:

speech enhancement; sub-band processing; deep complex convolution recurrent network; neural network

DOI:

10.21437/Interspeech.2021-1482

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification codes:

100104; 100213

Abstract:

Deep complex convolution recurrent network (DCCRN), which extends CRN with a complex-valued structure, achieved superior performance in MOS evaluation in the Interspeech 2020 deep noise suppression challenge (DNS2020). This paper further extends DCCRN with the following significant revisions. We first extend the model to sub-band processing, where the bands are split and merged by learnable neural network filters instead of engineered FIR filters, leading to a faster noise suppressor trained in an end-to-end manner. Then the LSTM is replaced with a complex TF-LSTM to better model temporal dependencies along both the time and frequency axes. Moreover, instead of simply concatenating the output of each encoder layer to the input of the corresponding decoder layer, we use convolution blocks to first aggregate essential information from the encoder output before feeding it to the decoder layers. We specifically formulate the decoder with an extra a priori SNR estimation module to maintain good speech quality while removing noise. Finally, a post-processing module is adopted to further suppress the unnatural residual noise. The new model, named DCCRN+, surpasses the original DCCRN as well as several competitive models in terms of PESQ and DNSMOS, and achieved superior performance in the new Interspeech 2021 DNS challenge.
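
To illustrate the quantity that the a priori SNR estimation module refers to, the following is a minimal sketch of the standard per-bin definition (clean speech power over noise power, in dB). It assumes oracle access to the clean and noise STFT bins, as when building training targets from synthetic mixtures. The function name `a_priori_snr_db` and the `eps` parameter are hypothetical and not taken from the paper, whose exact training target may differ.

```python
import math


def a_priori_snr_db(clean, noise, eps=1e-12):
    """Oracle a priori SNR in dB for each time-frequency bin.

    clean, noise: nested lists [frame][bin] of complex STFT values.
    eps is an illustrative guard (an assumption, not from the paper)
    that avoids log(0) and division by zero on silent bins.
    """
    snr_db = []
    for clean_frame, noise_frame in zip(clean, noise):
        row = []
        for s, n in zip(clean_frame, noise_frame):
            speech_power = abs(s) ** 2
            noise_power = abs(n) ** 2
            row.append(10.0 * math.log10((speech_power + eps) / (noise_power + eps)))
        snr_db.append(row)
    return snr_db


# Toy example: one frame with two frequency bins.
clean_spec = [[1 + 0j, 0 + 2j]]
noise_spec = [[1 + 0j, 0 + 1j]]
snr = a_priori_snr_db(clean_spec, noise_spec)
```

In this toy example the first bin has equal speech and noise power, and the second bin has four times more speech power than noise power.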

Pages: 2816-2820

Page count: 5

