Multiple statistical models for soft decision in noisy speech enhancement

Cited by: 11
Authors
Chang, Joon-Hyuk [1]
Gazor, Saeed
Kim, Nam Soo
Mitra, Sanjit K.
Affiliations
[1] Inha Univ, Sch Elect Engn, Inchon 402751, South Korea
[2] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
[3] Seoul Natl Univ, Sch Elect Engn, Seoul 151742, South Korea
[4] Seoul Natl Univ, INMC, Seoul 151742, South Korea
[5] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
speech enhancement; DCT; multiple statistical model; Gaussian; Laplacian; Gamma; GOF; PSFM; SAP; PESQ;
DOI
10.1016/j.patcog.2006.07.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most speech enhancement algorithms assume that speech and noise are both Gaussian in the discrete cosine transform (DCT) domain. To further enhance noisy speech in the DCT domain, we consider multiple statistical distributions (Gaussian, Laplacian, and Gamma) as candidate models for the noise and speech. We first use a goodness-of-fit (GOF) test to measure how far each assumed model deviates from the actual distribution of each DCT component of noisy speech. Our evaluations show that the best candidate for each frequency bin depends on the signal-to-noise ratio (SNR) and the power spectral flatness measure (PSFM). In particular, since the PSFM exhibits a strong relation with the best statistical fit, we employ a simple recursive estimate of the PSFM in the model selection. The proposed speech enhancement algorithm computes a soft estimate of the speech absence probability (SAP) separately for each frequency bin according to the selected distribution. The proposed algorithms are evaluated with both objective and subjective tests on a large speech database, for various SNR values and types of background noise. These tests show that the proposed soft decision scheme, based on multiple statistical modeling or the PSFM, provides further speech quality enhancement compared with recent methods. © 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
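The two selection cues named in the abstract, a goodness-of-fit comparison and a spectral flatness measure, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the GOF statistic, the PSFM recursion, or any constants. The sketch assumes a Kolmogorov-Smirnov distance as the GOF statistic, the common geometric-mean over arithmetic-mean definition of spectral flatness, and a first-order recursive average with a hypothetical smoothing constant `alpha`. All function names are illustrative, and the Gamma candidate is omitted for brevity.

```python
import math
import random

def spectral_flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean of a power spectrum, in (0, 1]."""
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power) + eps
    return math.exp(log_mean) / arith_mean

def smooth(previous, current, alpha=0.9):
    """First-order recursive average; alpha is a hypothetical constant."""
    return alpha * previous + (1.0 - alpha) * current

def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def laplacian_cdf(x, sigma):
    """CDF of a zero-mean Laplacian with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)  # scale parameter giving variance sigma**2
    if x < 0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)

def ks_distance(samples, cdf):
    """Kolmogorov-Smirnov distance between the empirical and model CDFs."""
    xs = sorted(samples)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )

def best_model(samples):
    """Return the zero-mean candidate with the smallest KS distance."""
    sigma = math.sqrt(sum(x * x for x in samples) / len(samples))
    distances = {
        "gaussian": ks_distance(samples, lambda x: gaussian_cdf(x, sigma)),
        "laplacian": ks_distance(samples, lambda x: laplacian_cdf(x, sigma)),
    }
    return min(distances, key=distances.get), distances

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for the coefficients of one DCT bin across frames.
    coeffs = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(5000)]
    print(best_model(coeffs)[0])
    print(spectral_flatness([c * c for c in coeffs[:64]]))
```

In such a sketch, a flatness value near 1 indicates a noise-like, flat spectrum and a value near 0 a peaky one. How the paper maps the smoothed PSFM to a particular distribution is not stated in the abstract, so that mapping is left out here.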
Pages: 1123-1134
Page count: 12