Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement

Cited by: 3
Authors
Yu, Guochen [1, 2]
Li, Andong [2 ]
Liu, Wenzhe [2 ]
Zheng, Chengshi [2 ]
Wang, Yutian [1 ]
Wang, Hui [1 ]
Affiliations
[1] Commun Univ China, State Key Lab Media Convergence & Commun, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Acoust, Beijing, Peoples R China
Keywords
full-band speech enhancement; sub-bands fusion; dual-stream; decoupling-style concept; multi-stage; NETWORKS;
DOI
10.1109/ISCSLP57327.2022.10037937
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to the high computational complexity to model more frequency bands, it is still intractable to conduct full-band speech enhancement based on deep neural networks. Recent studies typically utilize the compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.
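The three-band split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 48 kHz sampling rate is inferred from the 24 kHz upper band edge, and the FFT size of 960 and the helper names `subband_bin_ranges` and `split_frame` are hypothetical choices made only for illustration. The sketch maps the low (0-8 kHz), middle (8-16 kHz), and high (16-24 kHz) bands to ranges of one-sided STFT bins.

```python
def subband_bin_ranges(sample_rate_hz=48000, n_fft=960,
                       edges_hz=(0, 8000, 16000, 24000)):
    """Return half-open [start, stop) STFT bin ranges, one per sub-band.

    sample_rate_hz and n_fft are assumed values, not taken from the paper.
    """
    n_bins = n_fft // 2 + 1            # one-sided spectrum: DC up to Nyquist
    hz_per_bin = sample_rate_hz / n_fft
    ranges = []
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        start = int(round(lo / hz_per_bin))
        stop = int(round(hi / hz_per_bin))
        ranges.append((start, stop))
    # Fold the Nyquist bin into the last band so every bin is covered once.
    last_start, _ = ranges[-1]
    ranges[-1] = (last_start, n_bins)
    return ranges


def split_frame(frame_bins, ranges):
    """Slice one frame of spectral bins into per-band lists."""
    return [frame_bins[start:stop] for start, stop in ranges]


if __name__ == "__main__":
    ranges = subband_bin_ranges()
    low, mid, high = split_frame(list(range(481)), ranges)
    print(ranges, len(low), len(mid), len(high))
```

Under these assumptions each bin spans 50 Hz, so the low and middle bands hold 160 bins each and the high band holds 161 (it includes the Nyquist bin). In a staged system of the kind the abstract outlines, each slice would be passed to its own sub-network.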
Pages: 483-487
Page count: 5