Speech enhancement by Bayesian estimation of clean speech modeled as super Gaussian given a priori knowledge of phase

Cited by: 2
Authors
Vanambathina, Sunnydayal [1]
Kumar, T. Kishore [1]
Affiliations
[1] Natl Inst Technol Warangal, Dept Elect & Commun Engn, Warangal 506004, Telangana, India
Keywords
ML estimator; MAP estimator; MMSE estimator; Laplace density; von Mises distribution; Nakagami distribution; SPECTRAL MAGNITUDE ESTIMATION; SQUARE ERROR ESTIMATION; NOISE; COEFFICIENTS;
DOI
10.1016/j.specom.2015.11.004
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
In this paper, STFT-based speech enhancement algorithms that estimate short-time spectral amplitudes are proposed. These algorithms use maximum likelihood (ML), maximum a posteriori (MAP), and minimum mean square error (MMSE) estimators, with Laplace and Gaussian probability density functions (pdfs) as noise spectral amplitude priors and Nakagami and Gamma distributions as speech spectral amplitude priors. The method uses a joint MMSE estimate of the clean speech amplitude and the clean speech phase, given uncertain phase information, for improved single-channel speech enhancement. Most speech enhancement algorithms concentrate only on the frequency-domain amplitude of speech and not on the phase of the noisy speech, since altering the phase may cause undesired artifacts. In this paper, a recent phase reconstruction algorithm is used to estimate the phase of the clean speech. The reconstructed phase is treated as uncertain prior knowledge when deriving a joint MMSE estimate of the Complex speech coefficients given Uncertain Phase (CUP) information. The proposed MMSE-optimal CUP estimator reduces undesired artifacts and yields a satisfactory compromise between the phase of the noisy signal and the prior phase estimate. We evaluate all of the above estimators on speech signals uttered by 10 male and 10 female speakers, taken from the TIMIT database. The proposed method outperforms other benchmark algorithms in terms of segmental signal-to-noise ratio (SSNR), short-time objective intelligibility (STOI), and perceptual evaluation of speech quality (PESQ). (C) 2015 Elsevier B.V. All rights reserved.
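The abstract names segmental SNR as one of its evaluation measures without defining it. As a rough illustration only, the sketch below computes one common form of the measure: the per-frame SNR in dB between a clean signal and its enhanced estimate, averaged over frames. The function name `segmental_snr`, the non-overlapping 256-sample frames, and the clamping range of -10 dB to 35 dB are assumptions made for this example and are not taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB between a clean signal and its estimate.

    frame_len, lo_db and hi_db are illustrative defaults (assumptions),
    not values reported in the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")

    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction saturates at the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        # Clamp each frame so a few extreme frames do not dominate the mean.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))

    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [math.sin(0.05 * n) for n in range(1024)]
    enhanced = [1.1 * x for x in clean]  # 10% amplitude error in every sample
    print(round(segmental_snr(clean, enhanced), 2))
```

In the usage example the error energy is about 1% of the signal energy in every frame, so each frame sits near 20 dB. Published evaluations often use overlapping, windowed frames; the non-overlapping framing here is only the simplest choice.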
Pages: 8 - 27
Number of pages: 20
Related papers
50 in total
  • [1] Bayesian estimation for speech enhancement given a priori knowledge of clean speech phase
    Sunnydayal V.
    Kumar T.K.
    International Journal of Speech Technology, 2015, 18 (04) : 593 - 607
  • [2] Bayesian Estimation of Clean Speech Spectral Coefficients Given a Priori Knowledge of the Phase
    Gerkmann, Timo
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2014, 62 (16) : 4199 - 4208
  • [3] Speech enhancement using super-Gaussian speech models and noncausal a priori SNR estimation
    Cohen, I
    SPEECH COMMUNICATION, 2005, 47 (03) : 336 - 350
  • [4] Masking Estimation with Phase Restoration of Clean Speech for Monaural Speech Enhancement
    Wang, Xianyun
    Bao, Changchun
    INTERSPEECH 2019, 2019, : 3188 - 3192
  • [5] Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model
    Thomas Lotter
    Peter Vary
    EURASIP Journal on Advances in Signal Processing, 2005
  • [6] Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model
    Lotter, T
    Vary, P
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2005, 2005 (07) : 1110 - 1126
  • [7] A priori SNR estimation and noise estimation for speech enhancement
    Yao, Rui
    Zeng, ZeQing
    Zhu, Ping
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2016,
  • [8] A priori SNR estimation and noise estimation for speech enhancement
    Rui Yao
    ZeQing Zeng
    Ping Zhu
    EURASIP Journal on Advances in Signal Processing, 2016
  • [9] IMPROVED A PRIORI SNR ESTIMATION IN SPEECH ENHANCEMENT
    Nahma, Lara
    Yong, Pei Chee
    Dam, Hai Huyen
    Nordholm, Sven
    2017 23RD ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS (APCC): BRIDGING THE METROPOLITAN AND THE REMOTE, 2017, : 253 - 257
  • [10] An improved estimation of a priori speech absence probability for speech enhancement: In perspective of speech perception
    Choi, MS
    Kang, HG
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 1117 - 1120