Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement

Cited by: 3

Authors:

Yu, Guochen [1,2]

Li, Andong [2]

Liu, Wenzhe [2]

Zheng, Chengshi [2]

Wang, Yutian [1]

Wang, Hui [1]

Affiliations:

[1] Commun Univ China, State Key Lab Media Convergence & Commun, Beijing, Peoples R China

[2] Chinese Acad Sci, Inst Acoust, Beijing, Peoples R China

Source:

2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2022

Keywords:

full-band speech enhancement; sub-bands fusion; dual-stream; decoupling-style concept; multi-stage; NETWORKS;

DOI:

10.1109/ISCSLP57327.2022.10037937

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Due to the high computational complexity of modeling more frequency bands, it is still intractable to conduct full-band speech enhancement based on deep neural networks. Recent studies typically use compressed, perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum with one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low-band (0-8 kHz), middle-band (8-16 kHz), and high-band (16-24 kHz) spectra in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully exploit information exchange between the bands, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.
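
The band split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_sub_bands`, the 48 kHz sample rate, the 960-point FFT (481 one-sided bins), and the rule that each edge bin is assigned to the lower band are all assumptions chosen for illustration. Only the 0-8, 8-16, and 16-24 kHz band boundaries come from the abstract.

```python
import numpy as np

def split_sub_bands(spec, sample_rate=48000, edges_hz=(8000, 16000)):
    """Split a one-sided spectrum of shape (freq_bins, frames) into
    low, middle, and high bands at the given edge frequencies.

    Hypothetical helper: the edge bin itself is assigned to the lower band.
    """
    n_bins = spec.shape[0]
    nyquist = sample_rate / 2
    # Map each edge frequency in Hz to the nearest bin index.
    idx = [int(round(edge * (n_bins - 1) / nyquist)) for edge in edges_hz]
    low = spec[: idx[0] + 1]
    mid = spec[idx[0] + 1 : idx[1] + 1]
    high = spec[idx[1] + 1 :]
    return low, mid, high

# Assumed setup: 481 bins (960-point FFT at 48 kHz), 10 frames of random data.
rng = np.random.default_rng(0)
spec = rng.standard_normal((481, 10)) + 1j * rng.standard_normal((481, 10))
low, mid, high = split_sub_bands(spec)

# Per the abstract, the low band is processed as a complex spectrum,
# while the middle and high bands are processed in the magnitude domain.
mid_mag, high_mag = np.abs(mid), np.abs(high)
```

Under these assumptions the low band holds bins 0 to 160 and the middle and high bands hold 160 bins each; concatenating the three bands along the frequency axis restores the original spectrum.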

Pages: 483-487

Page count: 5
