Comparing Jacobian adaptation with cepstral mean normalization and parallel model combination for noise robust speech recognition

Cited by: 0

Authors:

Pärssinen, K [1]

Salmela, P [1]

Harju, M [1]

Kiss, I [1]

Affiliations:

[1] Tampere Univ Technol, Inst Digital & Comp Syst, FIN-33101 Tampere, Finland

Source:

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, two techniques are investigated for Jacobian adaptation (JA) in the presence of additive noise. Since the original concept of JA was presented only for static cepstral coefficients, the performance of JA is studied when it is extended to also cover the delta cepstrum. However, neither this extension nor the original concept provides accurate recognition performance when the mismatch between the training and recognition environments falls outside the linear range of JA. This problem can be alleviated to some extent by dividing JA into two steps. First, the adaptation is done, e.g., from the clean environment to a target environment with a "high" SNR level. The new JA matrices are then calculated and used in the second step to adapt the system to the lower target SNR level. Both adaptation methods were compared to cepstral mean normalization (CMN) and parallel model combination (PMC) in an isolated word recognition task with a vocabulary of 200 English words. The best performance was achieved with PMC, but JA showed performance comparable to CMN and outperformed it when JA was done in two steps from an SNR of 25 dB to 5 dB. The system was tested with the SpeechDat(II) database by adding noise recorded inside a car to the test set utterances at various SNR levels.
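The two-step idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes the commonly cited form of the JA update for a static cepstral mean, with Jacobian C · diag(N / (S + N)) · C⁻¹ and additive noise in the linear spectral domain. It ignores delta coefficients, and all function names and spectrum values are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse is its transpose."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)
    return c

def combine(c_speech, c_noise, C):
    """Exact noisy cepstrum when speech and noise add in the linear spectral domain."""
    return C @ np.log(np.exp(C.T @ c_speech) + np.exp(C.T @ c_noise))

def jacobian(c_speech, c_noise, C):
    """Derivative of the noisy cepstrum with respect to the noise cepstrum."""
    S = np.exp(C.T @ c_speech)
    N = np.exp(C.T @ c_noise)
    return C @ np.diag(N / (S + N)) @ C.T

def ja_step(c_noisy_ref, J, c_noise_ref, c_noise_new):
    """First-order (Jacobian) update of a noisy cepstral mean."""
    return c_noisy_ref + J @ (c_noise_new - c_noise_ref)

n_bins = 8
C = dct_matrix(n_bins)

# Hypothetical log spectra: one speech mean and three noise levels
# (near-clean reference, intermediate "high SNR", low-SNR target).
c_speech = C @ np.linspace(2.0, 4.0, n_bins)
c_noise_ref = C @ np.full(n_bins, -2.0)
c_noise_mid = C @ np.full(n_bins, 1.0)
c_noise_tgt = C @ np.full(n_bins, 3.0)

c_ref = combine(c_speech, c_noise_ref, C)
c_exact = combine(c_speech, c_noise_tgt, C)

# One step: reference condition straight to the target condition.
J_ref = jacobian(c_speech, c_noise_ref, C)
c_one = ja_step(c_ref, J_ref, c_noise_ref, c_noise_tgt)

# Two steps: adapt to the intermediate condition, recompute the Jacobian
# there, then adapt from the intermediate condition to the target.
c_mid = ja_step(c_ref, J_ref, c_noise_ref, c_noise_mid)
J_mid = jacobian(c_speech, c_noise_mid, C)
c_two = ja_step(c_mid, J_mid, c_noise_mid, c_noise_tgt)

err_one = np.linalg.norm(c_one - c_exact)
err_two = np.linalg.norm(c_two - c_exact)
print(f"one-step error: {err_one:.4f}, two-step error: {err_two:.4f}")
```

In this toy setting the two-step estimate lies closer to the exact noisy cepstrum than the one-step estimate, because the linearization is refreshed at the intermediate noise level. The sketch says nothing about recognition accuracy, which the paper evaluates on SpeechDat(II).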

Pages: 193-196

Page count: 4
