Spectral Reconstruction and Noise Model Estimation Based on a Masking Model for Noise Robust Speech Recognition

Authors
Jose A. Gonzalez
Angel M. Gómez
Antonio M. Peinado
Ning Ma
Jon Barker
Affiliations
[1] Department of Computer Science, University of Sheffield
[2] Department of Signal Theory, Telematics and Communications
Keywords
Speech recognition; Noise robustness; Feature compensation; Noise model estimation; Missing data imputation
DOI
Not available
Abstract
An effective way to increase noise robustness in automatic speech recognition (ASR) systems is feature enhancement based on an analytical distortion model that describes the effects of noise on the speech features. One such distortion model, reported to achieve a good trade-off between accuracy and simplicity, is the masking model. Under this model, speech distortion caused by environmental noise is seen as a spectral mask and, as a result, each noisy speech feature is either reliable (speech is not masked by noise) or unreliable (speech is masked). In this paper, we present a detailed overview of this model and its applications to noise-robust ASR. First, using the masking model, we derive a spectral reconstruction technique aimed at enhancing the noisy speech features. Two problems must be solved to perform spectral reconstruction with the masking model: (1) mask estimation, i.e. determining the reliability of the noisy features, and (2) feature imputation, i.e. estimating speech for the unreliable features. Unlike missing data imputation techniques that treat the two problems as independent, our technique addresses them jointly by exploiting a priori knowledge of the speech and noise sources in the form of a statistical model. Second, we propose an algorithm for estimating the noise model required by the feature enhancement technique. The proposed algorithm fits a Gaussian mixture model to the noise by iteratively maximising the likelihood of the noisy speech signal, so that noise can be estimated even during speech-dominated frames. A comprehensive set of experiments carried out on the Aurora-2 and Aurora-4 databases shows that the proposed method achieves significant improvements over the baseline system and other similar missing data imputation techniques.
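The reliable/unreliable distinction in the abstract can be illustrated with a small sketch. It assumes the masking model takes the common log-max form, in which each log-spectral channel of the noisy signal is approximated by the larger of the speech and noise values; the abstract does not state the exact formulation, so this form, the function names, and the toy values are all illustrative assumptions rather than the authors' implementation. The mask here is an oracle mask computed from known speech and noise, which a real system would have to estimate.

```python
import math

def exact_log_mix(speech, noise):
    """Log-spectrum of the sum of speech and noise powers: log(e^x + e^n)."""
    return [math.log(math.exp(x) + math.exp(n)) for x, n in zip(speech, noise)]

def masking_model(speech, noise):
    """Assumed log-max approximation: y is taken as max(x, n) per channel."""
    return [max(x, n) for x, n in zip(speech, noise)]

def oracle_mask(speech, noise):
    """True where speech dominates the channel (reliable), False where it is masked."""
    return [x > n for x, n in zip(speech, noise)]

# Hypothetical log-spectral values for one frame with four channels.
speech = [5.0, 1.0, 3.0, 0.5]
noise = [2.0, 2.5, 2.9, 4.0]

noisy = masking_model(speech, noise)
mask = oracle_mask(speech, noise)
exact = exact_log_mix(speech, noise)

# Gap between exact mixing and the log-max approximation, per channel.
approx_error = [e - y for e, y in zip(exact, noisy)]
```

Under this assumed form, a reliable channel satisfies `noisy == speech`, while an unreliable channel only gives an upper bound on the speech value, which is why imputation is needed there. The approximation error equals log(1 + e^(-|x - n|)) per channel, so it never exceeds log 2 and is largest when speech and noise have similar energy.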
Pages: 3731-3760
Number of pages: 29