Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter

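Authors
Wang, Dujuan [1]
Bao, Changchun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
beamforming; speech enhancement; residual neural network; real and imaginary masks; postfilter
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract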
Deep neural network (DNN) based ideal ratio mask (IRM) estimation methods have yielded good performance in monaural speech enhancement, and they have also shown considerable potential for beamforming and multichannel speech enhancement. The minimum variance distortionless response (MVDR) beamformer depends on accurate estimates of the speech and noise covariance matrices, and the accuracy of the estimated time-frequency (T-F) mask strongly affects those covariance estimates. This paper therefore proposes an MVDR beamformer for speech enhancement based on a complex real and imaginary ratio mask (CRIRM) estimated with a residual network. First, a residual neural network estimates the real and imaginary masks of the speech and the noise. Next, the estimated masks are used to obtain estimates of the speech and the noise. Finally, the covariance matrices of the speech and the noise are estimated and applied in the MVDR beamformer. To further reduce residual noise, the output of the MVDR beamformer is then processed by an end-to-end monaural speech enhancement module acting as a postfilter. Experiments show that the proposed method improves both the quality and the intelligibility of the enhanced speech.
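The mask-to-beamformer steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The complex masks are taken as given inputs (in the paper they would come from the residual network, which is not modeled here), and the weights use the common reference-channel MVDR form w = (Φ_n⁻¹ Φ_s) u / tr(Φ_n⁻¹ Φ_s), which is an assumption since the abstract does not state the exact formula. The function names, the `(channels, frequencies, frames)` array layout, and the diagonal-loading constant are likewise hypothetical.

```python
import numpy as np


def masked_covariance(X):
    """Per-frequency spatial covariance of an STFT X with shape (C, F, T)."""
    T = X.shape[-1]
    # Phi[f] = (1/T) * sum_t x(f, t) x(f, t)^H, result shape (F, C, C)
    return np.einsum("cft,dft->fcd", X, X.conj()) / T


def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-6):
    """Reference-channel MVDR weights, shape (F, C). Assumed formulation."""
    C = phi_n.shape[-1]
    # Diagonal loading (illustrative value) keeps the noise covariance invertible.
    tr_n = np.trace(phi_n, axis1=-2, axis2=-1).real
    load = (eps * tr_n + 1e-10)[:, None, None] * np.eye(C)
    numer = np.linalg.solve(phi_n + load, phi_s)      # Phi_n^{-1} Phi_s
    tr = np.trace(numer, axis1=-2, axis2=-1)[:, None]  # normalisation term
    return numer[:, :, ref] / (tr + 1e-12)


def mask_based_mvdr(Y, mask_s, mask_n, ref=0):
    """Apply complex masks, estimate covariances, and beamform Y (C, F, T)."""
    S_hat = mask_s * Y  # complex product applies the real and imaginary mask parts
    N_hat = mask_n * Y
    w = mvdr_weights(masked_covariance(S_hat), masked_covariance(N_hat), ref)
    # Beamformer output y(f, t) = w(f)^H x(f, t), shape (F, T)
    return np.einsum("fc,cft->ft", w.conj(), Y)
```

The single-channel output of `mask_based_mvdr` is what a monaural postfilter, such as the end-to-end enhancement module mentioned in the abstract, would then receive.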
Pages: 5