Improved speech absence probability estimation based on environmental noise classification

Authors
Young-ho Son
Sang-min Lee
Affiliations
[1] Inha University, Department of Electronic Engineering
[2] Inha University, Institute for Information and Electronics Research
Keywords
speech enhancement; soft decision; speech absence probability; Gaussian mixture model (GMM)
DOI
Not available
Abstract
An improved speech absence probability estimation method based on environmental noise classification is proposed for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires the “a priori” probability of speech absence, which is derived from the microphone input signal and the noise signal based on the estimated “a posteriori” signal-to-noise ratio (SNR). The conventional approach derives this probability with a fixed threshold and smoothing parameter, regardless of the noise type. To overcome this problem, first, the optimal values of these parameters in terms of perceived speech quality are derived for a variety of noise types. Second, the optimal values are assigned according to the noise type determined by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm thus estimates the speech absence probability with the optimal parameters for each noise type. The performance of the proposed method was evaluated by objective tests, such as the perceptual evaluation of speech quality (PESQ) and a composite measure, and then by a subjective test, namely mean opinion scores (MOS), under various noise environments. The proposed method shows better results than existing methods.
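The two-step idea in the abstract (classify the noise type with GMMs, then look up parameters tuned for that type) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional feature, the GMM values in `NOISE_GMMS`, the parameter table `SAP_PARAMS`, and the recursive update in `update_sap_prior`, which follows a common hard-decision form of a priori speech absence tracking rather than the authors' exact formula.

```python
import math

# Hypothetical per-noise-type GMMs over a 1-D feature.
# Each component is (weight, mean, variance); values are illustrative only.
NOISE_GMMS = {
    "white": [(0.6, 0.0, 1.0), (0.4, 1.0, 0.5)],
    "babble": [(0.5, 4.0, 1.0), (0.5, 6.0, 2.0)],
}

# Hypothetical per-type parameters (threshold on the a posteriori SNR and
# smoothing factor). The paper tunes such values for perceived speech quality;
# these numbers are placeholders.
SAP_PARAMS = {
    "white": {"snr_threshold": 0.8, "smoothing": 0.95},
    "babble": {"snr_threshold": 1.2, "smoothing": 0.90},
}


def gmm_log_likelihood(frames, components):
    """Sum of per-frame log-likelihoods under a 1-D Gaussian mixture."""
    total = 0.0
    for x in frames:
        p = sum(
            w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in components
        )
        total += math.log(p + 1e-300)  # small floor avoids log(0)
    return total


def classify_noise(frames):
    """Return the noise type whose GMM gives the highest log-likelihood."""
    return max(NOISE_GMMS, key=lambda name: gmm_log_likelihood(frames, NOISE_GMMS[name]))


def update_sap_prior(prev_q, post_snr, params):
    """One recursive update of the a priori speech absence probability.

    Assumed form: a hard decision (a posteriori SNR below the threshold means
    "speech absent") smoothed over time with the per-type smoothing factor.
    """
    absent = 1.0 if post_snr < params["snr_threshold"] else 0.0
    a = params["smoothing"]
    return a * prev_q + (1.0 - a) * absent


noise_type = classify_noise([0.1, 0.5, -0.2])
q = update_sap_prior(0.5, post_snr=0.5, params=SAP_PARAMS[noise_type])
```

In a real system the feature would be multi-dimensional and the GMMs would be trained on labeled noise recordings; the lookup table is the only part that directly mirrors the abstract's "optimal parameter of each noise type".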
Pages: 2548-2553
Page count: 5