A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments

Cited by: 7
Authors
Lun, Daniel P. K. [1]
Shen, Tak-Wai [1]
Ho, K. C. [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Ctr Signal Proc, Hong Kong, Hong Kong, Peoples R China
[2] Univ Missouri, Dept Elect & Comp Engn, Columbia, MO 65211 USA
Keywords
Cepstral analysis; expectation-maximization; speech enhancement; IMAGE-RECONSTRUCTION; EM ALGORITHM; GAIN;
DOI
10.1109/TASLP.2013.2290497
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Voiced speech has a quasi-periodic nature that allows it to be compactly represented in the cepstral domain, a feature that distinguishes it from noise. Recently, the temporal cepstrum smoothing (TCS) algorithm was proposed and shown to be effective for speech enhancement in non-stationary noise environments. However, the lack of an automatic parameter updating mechanism limits its adaptability to noisy speech with abrupt changes in SNR across time frames or frequency components. In this paper, an improved speech enhancement algorithm based on a novel expectation-maximization (EM) framework is proposed. The new algorithm starts with the traditional TCS method, which gives the initial guess of the periodogram of the clean speech. This estimate is then applied to an norm regularizer in the M-step of the EM framework to estimate the true power spectrum of the original speech. That in turn enables estimation of the a-priori SNR, which is used in the E-step, in fact a log-MMSE gain function, to refine the estimate of the clean speech periodogram. The M-step and E-step alternate until convergence. A notable improvement of the proposed algorithm over the traditional TCS method is its adaptability to changes (even abrupt changes) in the SNR of the noisy speech. The performance of the proposed algorithm is evaluated using standard measures on a large set of speech and noise signals. Evaluation results show that a significant improvement is achieved compared to conventional approaches, especially in non-stationary noise environments where most conventional algorithms fail to perform.
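The abstract describes the E-step as a log-MMSE gain function driven by the a-priori SNR. As a rough illustration only, the sketch below evaluates the standard log-spectral-amplitude gain, G = ξ/(1+ξ) · exp(½·E1(v)) with v = ξγ/(1+ξ), where ξ is the a-priori SNR and γ the a-posteriori SNR. It assumes this textbook form is the one meant; the paper's exact E-step, its M-step regularizer, and the TCS initialization are not reproduced here. The names `exp1` and `log_mmse_gain` are hypothetical helpers, not taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp1(x: float) -> float:
    """Exponential integral E1(x) = integral of exp(-t)/t from x to infinity, x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 40):
            term *= -x / k       # term now equals (-x)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction for x > 1, evaluated from the innermost level outward
    t = 0.0
    for k in range(80, 0, -1):
        t = k / (1.0 + k / (x + t))
    return math.exp(-x) / (x + t)


def log_mmse_gain(xi: float, gamma: float) -> float:
    """Log-spectral-amplitude gain for a-priori SNR xi and a-posteriori SNR gamma.

    Both SNRs are linear (not dB) and assumed strictly positive.
    """
    v = xi * gamma / (1.0 + xi)
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


if __name__ == "__main__":
    # The gain is applied per time-frequency bin to the noisy spectral amplitude.
    for xi, gamma in [(0.1, 1.0), (1.0, 2.0), (10.0, 11.0)]:
        print(f"xi={xi:5.1f} gamma={gamma:5.1f} gain={log_mmse_gain(xi, gamma):.4f}")
```

In this form the gain never falls below the Wiener gain ξ/(1+ξ), because the exponential factor is at least 1. In an iterative scheme like the one the abstract outlines, ξ would be re-estimated on each pass before the gain is recomputed.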
Pages: 335 - 346
Number of pages: 12
Related papers
50 in total
  • [41] A HIGHLY NON-STATIONARY NOISE TRACKING AND COMPENSATION ALGORITHM, WITH APPLICATIONS TO SPEECH ENHANCEMENT AND ON-LINE ASR
    Chowdhury, Md Foezur Rahman
    Selouani, Sid-Ahmed
    O'Shaughnessy, Douglas
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4337 - 4340
  • [42] Multi notch adaptive digital filter design for enhancement of speech signals embedded in non-stationary noise
    Erçelebi, E
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2004, 30 (02) : 79 - 95
  • [43] An expectation-maximization framework for comprehensive prediction of isoform-specific functions
    Karlebach, Guy
    Carmody, Leigh
    Sundaramurthi, Jagadish Chandrabose
    Casiraghi, Elena
    Hansen, Peter
    Reese, Justin
    Mungall, Christopher J.
    Valentini, Giorgio
    Robinson, Peter N.
    [J]. BIOINFORMATICS, 2023, 39 (04)
  • [44] Regularizing CTC in Expectation-Maximization Framework with Application to Handwritten Text Recognition
    Gao, Likun
    Zhang, Heng
    Li, Cheng-Lin
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [45] MULTI-MICROPHONE SPEECH DEREVERBERATION USING EXPECTATION-MAXIMIZATION AND KALMAN SMOOTHING
    Schwartz, Boaz
    Gannot, Sharon
    Habets, Emanuel A. P.
    [J]. 2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013,
  • [46] AN EXPECTATION-MAXIMIZATION ALGORITHM FOR MULTICHANNEL ADAPTIVE SPEECH DEREVERBERATION IN THE FREQUENCY-DOMAIN
    Schmid, Dominic
    Malik, Sarmad
    Enzner, Gerald
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 17 - 20
  • [47] Speech detection in non-stationary noise based on the 1/f process
    Wang, F
    Zheng, F
    Wu, WH
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2002, 17 (01): : 83 - 89
  • [48] Modelling non-stationary noise with spectral factorisation in automatic speech recognition
    Hurmalainen, Antti
    Gemmeke, Jort F.
    Virtanen, Tuomas
    [J]. COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03): : 763 - 779
  • [49] Speech Estimation in Non-Stationary Noise Environments Using Timing Structures between Mouth Movements and Sound Signals
    Kawashima, Hiroaki
    Horii, Yu
    Matsuyama, Takashi
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 442 - 445
  • [50] Speech detection in non-stationary noise based on the 1/f process
    Fan Wang
    Fang Zheng
    Wenhu Wu
    [J]. Journal of Computer Science and Technology, 2002, 17 : 83 - 89