CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION

Cited by: 0
Authors
Rehr, Robert [1]
Gerkmann, Timo
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Dept Med Phys, Speech Signal Proc Grp, Oldenburg, Germany
Keywords
automatic speech recognition; cepstral analysis; feature normalization; noise robustness; speech enhancement
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The robustness of speech recognizers to noise can be increased by normalizing the statistical moments of the Mel-frequency cepstral coefficients (MFCCs), e.g., by using cepstral mean normalization (CMN) or cepstral mean and variance normalization (CMVN). The necessary statistics are estimated over a long time window, often a complete utterance. Consequently, changes in the background noise can be tracked only to a limited extent, which restricts the performance gain these techniques can achieve. In contrast, recently developed single-channel speech enhancement algorithms can track the background noise quickly. In this paper, we aim to combine speech enhancement techniques with feature normalization methods. To this end, we propose transforming an estimate of the noise power spectral density to the MFCC domain, where it is subtracted from the noisy MFCCs. This is followed by conventional CMVN. For background noises that are too non-stationary for CMVN but can be tracked by the noise estimator, we show that this processing improves on applying CMVN alone. The performance gain is most pronounced at low signal-to-noise ratios.
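The processing chain described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`mel_filterbank`, `dct_matrix`, `to_mfcc`, `cmvn`, `cepstral_noise_subtraction`), the filterbank design, the 23 Mel bands, the 13 cepstral coefficients, and the spectral floor are all assumptions made here. The noise power spectral density is taken as given (synthetic data stands in for a real noise tracker), and per-utterance statistics are assumed for CMVN.

```python
import numpy as np


def mel_filterbank(n_mels, n_bins, sample_rate):
    """Triangular Mel filterbank (hypothetical design), shape (n_mels, n_bins)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def dct_matrix(n_ceps, n_mels):
    """First n_ceps rows of the orthonormal DCT-II matrix."""
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_mels)[None, :]
    dct = np.sqrt(2.0 / n_mels) * np.cos(np.pi * k * (m + 0.5) / n_mels)
    dct[0] /= np.sqrt(2.0)
    return dct


def to_mfcc(power_spec, mel_fb, dct, floor=1e-10):
    """Map power spectra (frames, n_bins) to MFCCs (frames, n_ceps)."""
    mel_energies = power_spec @ mel_fb.T
    return np.log(np.maximum(mel_energies, floor)) @ dct.T


def cmvn(feats, eps=1e-8):
    """Cepstral mean and variance normalization over all frames passed in."""
    mean = feats.mean(axis=0, keepdims=True)
    std = feats.std(axis=0, keepdims=True)
    return (feats - mean) / (std + eps)


def cepstral_noise_subtraction(noisy_psd, noise_psd, mel_fb, dct):
    """Subtract the noise estimate in the MFCC domain, then apply CMVN."""
    noisy_mfcc = to_mfcc(noisy_psd, mel_fb, dct)
    noise_mfcc = to_mfcc(noise_psd, mel_fb, dct)  # noise estimate as MFCCs
    return cmvn(noisy_mfcc - noise_mfcc)


# Synthetic stand-ins for real spectra: 200 frames, 257 frequency bins.
rng = np.random.default_rng(0)
noise_psd = rng.gamma(shape=2.0, scale=1.0, size=(200, 257))
speech_psd = 5.0 * rng.gamma(shape=2.0, scale=1.0, size=(200, 257))
noisy_psd = speech_psd + noise_psd

features = cepstral_noise_subtraction(
    noisy_psd, noise_psd, mel_filterbank(23, 257, 16000), dct_matrix(13, 23)
)
print(features.shape)
```

Because the MFCCs are a linear transform of log Mel energies, the subtraction in this sketch amounts to taking the log ratio of noisy to noise Mel energies per frame. In practice, `noise_psd` would come from a frame-wise noise tracker rather than being known exactly.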
Pages: 375-378
Page count: 4