Cepstral statistics compensation and normalization using online pseudo stereo codebooks for robust speech recognition in additive noise environments

Cited by: 1

Authors:

Hung, Jeih-Weih [1]

Affiliations:

[1] Natl Chi Nan Univ, Dept Elect Engn, Nantou County, Taiwan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2008 / Vol. E91D / No. 02

Keywords:

cepstral statistics compensation; pseudo stereo codebooks; linear least squares; quadratic least squares;

DOI:

10.1093/ietisy/e91-d.2.296

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper proposes several cepstral statistics compensation and normalization algorithms that alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transformations for both clean speech cepstra and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed speech cepstra are similar for both environments. Experimental results show that these codebook-based algorithms can provide significant performance gains compared to results obtained by using conventional utterance-based normalization approaches. The proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction for experiments conducted on Test Set A of the Aurora-2 digit database.
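For orientation, the conventional utterance-based baseline (U-CMVN) that the abstract compares against can be sketched as follows. This is a generic illustration of per-utterance mean and variance normalization, not the paper's codebook-based method. The function names `utterance_cmvn` and `relative_wer_reduction`, the `eps` guard, and the list-of-lists frame layout are all assumptions made for this sketch. The second helper shows the usual definition of relative word error rate reduction, which is assumed here to be the metric the abstract reports.

```python
from math import sqrt


def utterance_cmvn(frames, eps=1e-8):
    """Generic utterance-based CMVN (illustrative, not the paper's C-CMVN).

    frames: non-empty list of cepstral vectors (lists of floats), one per frame.
    Each dimension is shifted to zero mean and scaled to unit variance using
    statistics computed over this single utterance only.
    """
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over all frames of the utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Per-dimension (population) standard deviation.
    stds = [
        sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # eps (an assumed safeguard) avoids division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent, relative to the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

Because the statistics come from one utterance at a time, short or very noisy utterances give unreliable estimates; the abstract's codebook-based variants estimate the statistics from pseudo-stereo codebooks instead.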


Pages: 296-311

Page count: 16
