Speech Enhancement Method with Geometric Phase Estimation By Incorporating MIXMAX Model

Cited by: 1
Authors
Wang, Xianyun [1]
Bao, Changchun [1]
Affiliations
[1] Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/APSIPA.2016.7820908
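Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract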
In this paper, we propose a frequency-domain speech enhancement algorithm with phase estimation. Speech is modeled by a Gaussian mixture model (GMM) in the log-spectral domain, and two closed-form log-spectral amplitude estimators, one for speech and one for noise, are derived directly from a Mixture-Maximum (MIXMAX) model. Because accurate estimation of the speech phase can help reduce undesired residual noise in the enhanced signal, the two log-spectral estimators are also used to construct a geometric approach to phase estimation in each frequency bin. To resolve the ambiguity in phase estimation, we use complex linear predictive analysis (CLPA) and an inconsistency constraint to select an appropriate phase. Experimental results show that the proposed method improves speech quality compared with the reference methods.
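As a rough illustration of the geometric idea mentioned in the abstract, the sketch below applies the law of cosines to the triangle formed by the noisy, speech, and noise spectra in one frequency bin. It assumes the additive model Y = S + N and that magnitude estimates for speech and noise are already available (for example, from the two log-spectral estimators); the function name `phase_candidates` and its parameters are hypothetical, and this is not the authors' implementation. It only exposes the two-fold sign ambiguity; the CLPA and inconsistency-constraint steps used to choose between the candidates are not shown.

```python
import numpy as np


def phase_candidates(y, s_mag, n_mag):
    """Return the two candidate clean-speech phases for one frequency bin.

    y      : complex noisy STFT coefficient (assumed Y = S + N)
    s_mag  : estimated speech magnitude |S| (assumed given)
    n_mag  : estimated noise magnitude |N| (assumed given)
    """
    y_mag = np.abs(y)
    # Law of cosines: |N|^2 = |Y|^2 + |S|^2 - 2|Y||S|cos(theta_Y - theta_S)
    cos_delta = (y_mag**2 + s_mag**2 - n_mag**2) / (2.0 * y_mag * s_mag + 1e-12)
    # Estimated magnitudes may not form a valid triangle, so clip before arccos.
    delta = np.arccos(np.clip(cos_delta, -1.0, 1.0))
    theta_y = np.angle(y)
    # The sign of the phase difference is unknown: two candidates remain.
    return theta_y + delta, theta_y - delta


# Usage: build a synthetic bin with known speech phase 0.3 rad.
s = 1.0 * np.exp(1j * 0.3)
n = 0.5 * np.exp(1j * 1.2)
cand_plus, cand_minus = phase_candidates(s + n, np.abs(s), np.abs(n))
```

With exact magnitudes, one of the two candidates coincides with the true speech phase; with estimated magnitudes both are only approximations, which is why an additional selection criterion is needed.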
Pages: 4
Related papers
50 in total
  • [1] Mask estimation incorporating phase-sensitive information for speech enhancement
    Wang, Xianyun
    Bao, Changchun
    [J]. APPLIED ACOUSTICS, 2019, 156 : 101 - 112
  • [2] Role of Phase Estimation in Speech Enhancement
    Shannon, Benjamin J.
    Paliwal, Kuldip K.
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1423 - 1426
  • [3] Improved a priori SNR estimation for speech enhancement incorporating speech distortion component
    [J]. Ou, S., 1600, Universitas Ahmad Dahlan, Jalan Kapas 9, Semaki, Umbul Harjo, Yogyakarta, 55165, Indonesia (11)
  • [4] A Mask Estimation Method Integrating Data Field Model for Speech Enhancement
    Wang, Xianyun
    Bao, Changchun
    Bao, Feng
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1904 - 1908
  • [5] Incorporating a psychoacoustical model in frequency domain speech enhancement
    Hu, Y
    Loizou, PC
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2004, 11 (02) : 270 - 273
  • [6] Masking Estimation with Phase Restoration of Clean Speech for Monaural Speech Enhancement
    Wang, Xianyun
    Bao, Changchun
    [J]. INTERSPEECH 2019, 2019, : 3188 - 3192
  • [7] COMPLEX-VALUED GAUSSIAN PROCESS LATENT VARIABLE MODEL FOR PHASE-INCORPORATING SPEECH ENHANCEMENT
    Chen, Sih-Huei
    Lee, Yuan-Shan
    Wang, Jia-Ching
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5439 - 5443
  • [8] A Speech Enhancement Method by Coupling Speech Detection and Spectral Amplitude Estimation
    Deng, Feng
    Bao, Chang-Chun
    Bao, Feng
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3233 - 3237
  • [9] Phase Estimation in Single Channel Speech Enhancement Using Phase Decomposition
    Kulmer, Josef
    Mowlaee, Pejman
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (05) : 598 - 602
  • [10] Bayesian estimation for speech enhancement given a priori knowledge of clean speech phase
    Sunnydayal V.
    Kumar T.K.
    [J]. International Journal of Speech Technology, 2015, 18 (04) : 593 - 607