Combining feature compensation and Weighted Viterbi Decoding for noise robust speech recognition with limited adaptation data

Cited by: 0

Authors:

Cui, XD [1]

Alwan, A [1]

Affiliations:

[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Acoustic models trained with clean speech signals suffer in the presence of background noise. In some situations, only a limited amount of noisy data from the new environment is available with which to adapt the clean models. A feature compensation approach employing polynomial regression of the signal-to-noise ratio (SNR) is proposed in this paper. While the clean acoustic models remain unchanged, a bias that is a polynomial function of utterance SNR is estimated and removed from the noisy features. Depending on the amount of noisy data available, the algorithm can be carried out flexibly at different levels of granularity. Based on the Euclidean distance, the similarity between the residual distribution and the clean models is estimated and used as the confidence factor in a back-end Weighted Viterbi Decoding (WVD) algorithm. With limited amounts of noisy data, the feature compensation algorithm outperforms Maximum Likelihood Linear Regression (MLLR) on the Aurora2 database. Weighted Viterbi decoding further improves recognition accuracy.
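The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the estimation procedure. The sketch assumes paired clean and noisy per-utterance feature means are available and fits the bias by ordinary least squares. It also assumes one common form of weighted Viterbi decoding, in which each frame's emission log-likelihood is scaled by a confidence weight in [0, 1]. All function and parameter names (`fit_snr_bias`, `compensate`, `weighted_viterbi`, `gamma`) are hypothetical.

```python
import numpy as np

def fit_snr_bias(snr_db, noisy, clean, order=2):
    """Least-squares fit of a per-dimension polynomial bias in utterance SNR.

    snr_db: (N,) utterance SNRs; noisy, clean: (N, D) feature means.
    Assumption: paired clean/noisy data and a least-squares criterion.
    Returns coefficients of shape (order + 1, D), lowest power first.
    """
    residual = noisy - clean                              # observed bias, (N, D)
    V = np.vander(snr_db, order + 1, increasing=True)     # columns 1, snr, snr^2, ...
    coeffs, *_ = np.linalg.lstsq(V, residual, rcond=None)
    return coeffs

def compensate(features, snr_db, coeffs):
    """Subtract the SNR-dependent bias from all frames of one utterance."""
    powers = float(snr_db) ** np.arange(coeffs.shape[0])  # (order + 1,)
    bias = powers @ coeffs                                # (D,)
    return features - bias

def weighted_viterbi(log_b, log_A, log_pi, gamma):
    """Viterbi decoding with per-frame confidence weights.

    log_b: (T, S) emission log-likelihoods; log_A: (S, S) transition
    log-probabilities; log_pi: (S,) initial log-probabilities;
    gamma: (T,) weights in [0, 1] (assumed form: gamma[t] * log_b[t]).
    """
    T, S = log_b.shape
    delta = log_pi + gamma[0] * log_b[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A                   # (previous, next)
        back[t] = np.argmax(scores, axis=0)
        delta = scores.max(axis=0) + gamma[t] * log_b[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):                         # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With `gamma` set to all ones, `weighted_viterbi` reduces to standard Viterbi decoding. Setting a frame's weight to zero makes the decoder ignore that frame's acoustic evidence and rely on the transition model alone.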


Pages: 969-972

Page count: 4

