Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter

Cited by: 0

Authors:

Wang, Dujuan [1]

Bao, Changchun [1]

Affiliation:

[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020) | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

beamforming; speech enhancement; residual neural network; real and imaginary masks; postfilter;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Deep neural network (DNN) based ideal ratio mask (IRM) estimation methods have yielded good performance in monaural speech enhancement, and they have also shown considerable potential for beamforming and multichannel speech enhancement. The minimum variance distortionless response (MVDR) beamformer depends critically on accurate estimates of the speech and noise covariance matrices, and the accuracy of the estimated time-frequency (T-F) masks has a significant impact on those estimates. This paper therefore proposes an MVDR beamformer for speech enhancement based on a complex real and imaginary ratio mask (CRIRM) estimated with a residual network. First, the real and imaginary masks of speech and noise are estimated by a residual neural network. Next, estimates of the speech and noise are obtained by applying the estimated masks. Finally, the covariance matrices of speech and noise are estimated and applied to the MVDR beamformer. In addition, to further reduce residual noise, the output of the MVDR beamformer is processed by an end-to-end monaural speech enhancement module. Experiments show that the proposed method achieves better quality and intelligibility of the enhanced speech.
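
The abstract does not spell out the formulas, so the following is only a rough sketch of the mask-to-beamformer steps it describes, assuming the widely used reference-channel MVDR form w(f) = Φ_n(f)^{-1} Φ_s(f) u / tr(Φ_n(f)^{-1} Φ_s(f)). It is not the authors' implementation. The function names, the array layout (channels, frames, frequencies), and the diagonal-loading constant are assumptions made for illustration, and the masks are taken as given inputs rather than produced by a residual network.

```python
import numpy as np

def apply_complex_mask(mix_stft, mask_real, mask_imag):
    """Apply a complex mask (real + j * imaginary part) to a multichannel STFT.

    mix_stft: complex array of shape (channels, frames, freqs).
    mask_real, mask_imag: real arrays broadcastable to that shape.
    """
    return (mask_real + 1j * mask_imag) * mix_stft

def spatial_covariance(stft):
    """Average x(t, f) x(t, f)^H over frames; returns (freqs, channels, channels)."""
    num_frames = stft.shape[1]
    return np.einsum("ctf,dtf->fcd", stft, stft.conj()) / num_frames

def mvdr_weights(cov_speech, cov_noise, ref_channel=0, diag_load=1e-6):
    """Reference-channel MVDR weights, shape (freqs, channels).

    diag_load is an assumed regularization constant that keeps the
    noise covariance invertible; it is not taken from the paper.
    """
    num_channels = cov_noise.shape[-1]
    scale = np.trace(cov_noise, axis1=-2, axis2=-1).real / num_channels
    loading = (diag_load * scale + 1e-12)[:, None, None] * np.eye(num_channels)
    # Solve (Phi_n + loading) X = Phi_s for each frequency bin.
    numerator = np.linalg.solve(cov_noise + loading, cov_speech)
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    return numerator[..., ref_channel] / (trace[:, None] + 1e-12)

def beamform(weights, mix_stft):
    """Compute y(t, f) = w(f)^H x(t, f); returns (frames, freqs)."""
    return np.einsum("fc,ctf->tf", weights.conj(), mix_stft)
```

In this sketch, the speech and noise estimates would be obtained by calling apply_complex_mask twice (once with the speech masks, once with the noise masks), their covariances computed with spatial_covariance, and the resulting weights applied to the mixture with beamform. The single-channel output would then be passed to a separate monaural postfilter, which is not sketched here.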

Pages: 5

50 entries in total

[41] Noise eigenvalue modification methods for spatial subspace based multi-channel speech enhancement
Kim, Gibak
Cho, Nam Ik
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 573 - +
[42] A Multi-Channel Multi-Bit Programmable Photonic Beamformer Based on Cascaded DWDM
Yu, Anliang
Zou, Weiwen
Li, Shuguang
Chen, Jianping
IEEE PHOTONICS JOURNAL, 2014, 6 (04):
[43] A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users
Dekkers, Gert
van Waterschoot, Toon
Vanrumste, Bart
Van Den Broeck, Bert
Gemmeke, Jort F.
Van Hamme, Hugo
Karsmakers, Peter
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 746 - 750
[44] A Pre-Separation and All-Neural Beamformer Framework for Multi-Channel Speech Separation
Xie, Wupeng
Xiang, Xiaoxiao
Zhang, Xiaojuan
Liu, Guanghong
SYMMETRY-BASEL, 2023, 15 (02):
[45] DESNET: A MULTI-CHANNEL NETWORK FOR SIMULTANEOUS SPEECH DEREVERBERATION, ENHANCEMENT AND SEPARATION
Fu, Yihui
Wu, Jian
Hu, Yanxin
Xing, Mengtao
Xie, Lei
2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 857 - 864
[46] Speech enhancement by multi-channel crosstalk resistant adaptive noise cancellation
Zeng, Qingning
Abdulla, Waleed H.
2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 485 - 488
[47] A time-frequency fusion model for multi-channel speech enhancement
Zeng, Xiao
Xu, Shiyun
Wang, Mingjiang
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01):
[48] Construction of microphone arrays for the optimization of multi-channel speech enhancement systems
Drews, M
FREQUENZ, 1996, 50 (9-10) : 223 - 227
[49] Generalized postfilter for speech quality enhancement
Grancharov, Volodya
Plasberg, Jan H.
Samuelsson, Jonas
Kleijn, W. Bastiaan
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (01): 57 - 64
[50] An enhanced beamsteering algorithm based on MVDR for a multi-channel parametric array loudspeaker array
Zhu, Yunxi
Zhang, Yankai
Fan, Fengyi
Ma, Wenyao
Qin, Liwen
Kuang, Zheng
Wu, Ming
Yang, Jun
JOURNAL OF SOUND AND VIBRATION, 2025, 595
