Powered Cepstral Normalization (P-CN) for Robust Features in Speech Recognition

Cited by: 0

Authors:

Hsu, Chang-wen [1]

Lee, Lin-shan [1]

Affiliation:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

Robust speech recognition; cepstral normalization; cepstral mean and variance normalization;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Cepstral normalization is widely used as a powerful approach to producing robust features for speech recognition. Well-known examples in this family include Cepstral Mean Subtraction (CMS) and Cepstral Mean and Variance Normalization (CMVN), which normalize the first moment, or both the first and second moments, of the Mel-frequency Cepstral Coefficients (MFCCs). This paper proposes an improved approach, Powered Cepstral Normalization (P-CN), which normalizes the MFCC parameters in the r-th powered domain, where r > 1.0. The basic idea is that when the MFCC parameters are raised to the r-th power, the harmful parts of environmental disturbances may be emphasized more than the speech features, which are relatively smooth, so performing the normalization in the r-th powered domain may be more helpful. The value of r should not be too large, however, because the environmental disturbances may then be exaggerated and further corrupt the speech features. The approach is computationally simple and efficient. Initial experimental results on the AURORA 2.0 testing environment showed that significant improvements in recognition rates are consistently obtainable under all the different noisy conditions.
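The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the sign-preserving power (used because MFCCs can be negative), the inverse power applied after normalization, the default r = 1.5, and all function names below are assumptions made for illustration only. The `cmvn` function is the standard per-utterance mean and variance normalization.

```python
import math
from statistics import fmean, pstdev


def cmvn(track):
    """Mean and variance normalization of one coefficient track, i.e. the
    values of a single MFCC dimension across the frames of an utterance."""
    mean = fmean(track)
    std = pstdev(track, mu=mean)
    if std == 0.0:
        # Constant track: nothing is left after removing the mean.
        return [0.0 for _ in track]
    return [(x - mean) / std for x in track]


def signed_power(x, r):
    """Raise |x| to the power r while keeping the sign of x.
    Assumption of this sketch, not taken from the paper."""
    return math.copysign(abs(x) ** r, x)


def powered_cmvn(track, r=1.5):
    """Hypothetical sketch: normalize in the r-th powered domain, then map
    back with the inverse power (the mapping back is also an assumption)."""
    powered = [signed_power(x, r) for x in track]
    normalized = cmvn(powered)
    return [signed_power(x, 1.0 / r) for x in normalized]


def normalize_utterance(frames, r=1.5):
    """Apply powered_cmvn independently to each coefficient dimension.
    `frames` is a list of per-frame MFCC vectors of equal length."""
    tracks = zip(*frames)  # transpose: one track per coefficient
    normalized_tracks = [powered_cmvn(list(t), r) for t in tracks]
    return [list(frame) for frame in zip(*normalized_tracks)]
```

With r = 1.0 the sketch reduces to plain CMVN, which makes it easy to compare the two on the same features. Larger values of r stretch large-magnitude coefficients more strongly before normalization, which is consistent with the abstract's caution that r should not be too large.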


Pages: 2538-2541

Page count: 4

