A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments

Cited by: 7
Authors
Lun, Daniel P. K. [1 ]
Shen, Tak-Wai [1 ]
Ho, K. C. [2 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Ctr Signal Proc, Hong Kong, Hong Kong, Peoples R China
[2] Univ Missouri, Dept Elect & Comp Engn, Columbia, MO 65211 USA
Keywords
Cepstral analysis; expectation-maximization; speech enhancement; IMAGE-RECONSTRUCTION; EM ALGORITHM; GAIN;
DOI
10.1109/TASLP.2013.2290497
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Voiced speech is quasi-periodic, which allows it to be represented compactly in the cepstral domain. This property distinguishes it from noise. Recently, the temporal cepstrum smoothing (TCS) algorithm was proposed and shown to be effective for speech enhancement in non-stationary noise environments. However, the lack of an automatic parameter updating mechanism limits its adaptability to noisy speech with abrupt changes in SNR across time frames or frequency components. In this paper, an improved speech enhancement algorithm based on a novel expectation-maximization (EM) framework is proposed. The new algorithm starts with the traditional TCS method, which gives the initial guess of the periodogram of the clean speech. This estimate is then applied to a norm regularizer in the M-step of the EM framework to estimate the true power spectrum of the original speech. The power spectrum estimate in turn enables estimation of the a priori SNR, which is used in the E-step, in effect a log-MMSE gain function, to refine the estimate of the clean speech periodogram. The M-step and E-step alternate until convergence. A notable improvement of the proposed algorithm over the traditional TCS method is its adaptability to changes (even abrupt changes) in the SNR of the noisy speech. The performance of the proposed algorithm is evaluated using standard measures on a large set of speech and noise signals. Evaluation results show that a significant improvement is achieved over conventional approaches, especially in non-stationary noise environments, where most conventional algorithms fail to perform well.
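The abstract describes the E-step as a log-MMSE gain function driven by the a priori SNR. As a rough illustration only, the sketch below computes the textbook log-spectral-amplitude (log-MMSE) gain from an a priori SNR `xi` and an a posteriori SNR `gamma`. The function names `exp1` and `log_mmse_gain` and the pure-Python exponential integral are assumptions made for this example; the record does not show how the paper itself implements or applies the gain.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x: float) -> float:
    """Exponential integral E1(x) for x > 0 (hypothetical helper)."""
    if x <= 0.0:
        raise ValueError("E1(x) is only computed here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, 60):
            term *= -x / k      # term now equals (-x)^k / k!
            total -= term / k
        return total
    # Continued fraction evaluated with the modified Lentz method for x > 1.
    b = x + 1.0
    c = 1.0e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(-x)


def log_mmse_gain(xi: float, gamma: float) -> float:
    """Textbook log-MMSE gain for a priori SNR xi and a posteriori SNR gamma.

    Both SNRs are linear (not dB) and must be positive.
    """
    v = xi * gamma / (1.0 + xi)
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


if __name__ == "__main__":
    # Illustrative SNR values only; they are not taken from the paper.
    for xi in (0.1, 1.0, 10.0):
        print(xi, log_mmse_gain(xi, gamma=xi + 1.0))
```

In an iterative scheme like the one outlined in the abstract, a gain of this form would be re-evaluated each time the a priori SNR estimate is updated; the stopping rule and the M-step regularizer are not specified in this record and are left out of the sketch.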
Pages: 335-346
Page count: 12
Related papers
50 in total
  • [1] Speech enhancement for non-stationary noise environments
    Cohen, I
    Berdugo, B
    [J]. SIGNAL PROCESSING, 2001, 81 (11) : 2403 - 2418
  • [2] Improved Expectation-Maximization Framework for Speech Enhancement Based on Iterative Noise Estimation
    Li, Tingtian
    Lun, Daniel P. K.
    Shen, Tak-Wai
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2015, : 287 - 291
  • [3] Anatomical guided segmentation with non-stationary tissue class distributions in an expectation-maximization framework
    Pohl, KM
    Bouix, S
    Kikinis, R
    Grimson, WEL
    [J]. 2004 2ND IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: MACRO TO NANO, VOLS 1 AND 2, 2004, : 81 - 84
  • [4] Single Channel Speech Enhancement for Mixed Non-stationary Noise Environments
    Singh, Sachin
    Tripathy, Manoj
    Anand, R. S.
    [J]. ADVANCES IN SIGNAL PROCESSING AND INTELLIGENT RECOGNITION SYSTEMS, 2014, 264 : 545 - 555
  • [5] Sparse Hidden Markov Models for Speech Enhancement in Non-Stationary Noise Environments
    Deng, Feng
    Bao, Changchun
    Kleijn, W. Bastiaan
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1973 - 1987
  • [6] Robust Speech Enhancement Techniques for ASR in Non-stationary Noise and Dynamic Environments
    Liu, Gang
    Dimitriadis, Dimitrios
    Bocchieri, Enrico
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3016 - 3020
  • [7] SPARSE HMM-BASED SPEECH ENHANCEMENT METHOD FOR STATIONARY AND NON-STATIONARY NOISE ENVIRONMENTS
    Deng, Feng
    Bao, Chang-chun
    Kleijn, W. Bastiaan
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5073 - 5077
  • [8] Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments
    Malah, D
    Cox, RV
    Accardi, AJ
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 789 - 792
  • [9] Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments
    Duan, Zhiyao
    Mysore, Gautham J.
    Smaragdis, Paris
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 594 - 597
  • [10] Enhancement and Noise Statistics Estimation for Non-Stationary Voiced Speech
    Norholm, Sidsel Marie
    Jensen, Jesper Rindom
    Christensen, Mads Græsbøll
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (04) : 645 - 658