Cepstral statistics compensation and normalization using online pseudo stereo codebooks for robust speech recognition in additive noise environments

Cited by: 1
Author
Hung, Jeih-Weih [1]
Affiliation
[1] Natl Chi Nan Univ, Dept Elect Engn, Nantou County, Taiwan
Keywords
cepstral statistics compensation; pseudo stereo codebooks; linear least squares; quadratic least squares
DOI
10.1093/ietisy/e91-d.2.296
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes several cepstral statistics compensation and normalization algorithms that alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to estimate the cepstral statistics in both clean and noisy environments. The process yields transformations either for both clean and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed cepstra are similar in the two environments. Experimental results show that these codebook-based algorithms provide significant performance gains over conventional utterance-based normalization approaches. On Test Set A of the Aurora-2 digit database, the proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) methods outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction.
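The contrast the abstract draws between utterance-based and codebook-based statistics can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how the pseudo-stereo codebooks are built or weighted, so the `codewords` and `weights` values below are hypothetical placeholders, and the helper names (`cmvn_stats`, `apply_cmvn`, `relative_wer_reduction`) are invented for illustration. The sketch only shows standard mean and variance normalization with statistics taken either from the utterance frames or from a weighted set of codewords, plus the usual definition of relative word error rate reduction.

```python
import math

def cmvn_stats(vectors, weights=None):
    """Per-dimension weighted mean and standard deviation of cepstral vectors."""
    if weights is None:
        weights = [1.0] * len(vectors)  # uniform weights: plain sample statistics
    total = sum(weights)
    dim = len(vectors[0])
    mean = [sum(w * v[d] for w, v in zip(weights, vectors)) / total
            for d in range(dim)]
    var = [sum(w * (v[d] - mean[d]) ** 2 for w, v in zip(weights, vectors)) / total
           for d in range(dim)]
    # Guard against zero variance so the division below is always defined.
    std = [math.sqrt(x) if x > 0 else 1.0 for x in var]
    return mean, std

def apply_cmvn(frames, mean, std):
    """Subtract the mean and divide by the standard deviation, per dimension."""
    return [[(f[d] - mean[d]) / std[d] for d in range(len(f))] for f in frames]

def relative_wer_reduction(baseline_wer, new_wer):
    """Relative word error rate reduction, in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer

# Toy two-dimensional "cepstral" frames of one utterance (made-up numbers).
utterance = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]

# Utterance-based CMVN: statistics come from the utterance's own frames.
mean_u, std_u = cmvn_stats(utterance)
normalized_u = apply_cmvn(utterance, mean_u, std_u)

# Codebook-based variant (assumption): statistics come from a small set of
# codewords with occupancy weights instead of from the frames themselves.
codewords = [[0.0, 1.0], [4.0, 9.0]]  # hypothetical codebook
weights = [0.5, 0.5]                  # hypothetical codeword weights
mean_c, std_c = cmvn_stats(codewords, weights)
normalized_c = apply_cmvn(utterance, mean_c, std_c)
```

With utterance-based statistics, each dimension of `normalized_u` has zero mean and unit variance by construction. With codebook-based statistics, the same frames are normalized against environment-level estimates, which is the general idea the abstract attributes to C-CMVN.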
Pages: 296-311
Number of pages: 16