DENSELY CONNECTED MULTI-STAGE MODEL WITH CHANNEL WISE SUBBAND FEATURE FOR REAL-TIME SPEECH ENHANCEMENT

Cited by: 7
Authors
Li, Jingdong [1]
Luo, Dawei [1]
Liu, Yun [1]
Zhu, Yuanyuan [1]
Li, Zhaoxia [1]
Cui, Guohui [1]
Tang, Wenqi [1]
Chen, Wei [1]
Affiliations
[1] Sogou Inc, AI Interact Div, Beijing, Peoples R China
Keywords
speech enhancement; noise suppression; speech perceptual quality; supervised learning;
DOI
10.1109/ICASSP39728.2021.9413967
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Research on single-channel speech enhancement (SE) has a long tradition, but two main practical problems remain unsolved. First, it is hard to balance enhancement quality against computational efficiency, and low latency always brings a loss of quality. Second, enhancement in specific scenarios, such as singing and emotional speech, remains a difficult problem for conventional methods. In this paper, we propose a computationally efficient real-time speech enhancement network with densely connected multi-stage structures, which progressively enhances the channel-wise subband speech. The enhanced speech from an earlier stage is used to guide the processing of deeper stages in order to obtain coarse-to-fine estimations. In addition, supervision is applied to all intermediate results in order to stabilize training and accelerate convergence. Moreover, an adaptive fine-tuning step is applied with small datasets from specific scenarios, which yields marked improvement in the corresponding scenarios. As a result, the proposed method achieves promising performance improvements in terms of speech quality and demonstrates robustness in complex scenarios. We submitted the proposed method to the real-time denoising track of the Deep Noise Suppression (DNS) Challenge 2021, which was held by Microsoft. In the subjective evaluation, our system outperforms the DNS-Challenge baseline by 0.14 points in terms of mean opinion score (MOS).
Pages: 6638-6642
Number of pages: 5