Normalizing the speech modulation spectrum for robust speech recognition

Cited by: 0

Authors:

Xiao, Xiong [1,2]

Chng, Eng Siong [1]

Li, Haizhou [1,2]

Affiliations:

[1] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore

[2] Inst Infocomm Res, Singapore, Singapore

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

speech recognition; feature normalization; modulation spectrum; square-root Wiener filter; temporal filter;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper presents a novel feature normalization technique for robust speech recognition. The proposed technique normalizes the temporal structure of the features to reduce the feature variation caused by environmental interference. Specifically, it normalizes the utterance-dependent feature modulation spectrum to a reference function by filtering the features with a square-root Wiener filter in the temporal domain. We show experimentally that the proposed technique, when combined with mean and variance normalization (MVN), significantly reduces the word error rate on the AURORA-2 task, with a relative error rate reduction of 69.11% compared to the base me.
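
The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that the record does not confirm: the modulation spectrum of each feature dimension is estimated with a simple averaged periodogram, the filter's magnitude response is taken as sqrt(reference PSD / utterance PSD), the filter is realized as a short zero-phase FIR, and MVN is applied before filtering. All function names and parameters (`mvn`, `trajectory_psd`, `sqrt_gain_filter`, `normalize_modulation_spectrum`, `seg_len`, `n_taps`) are hypothetical.

```python
import numpy as np

def mvn(features):
    """Per-utterance mean and variance normalization; features is (frames, dims)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / np.maximum(std, 1e-8)

def trajectory_psd(x, seg_len=64):
    """Averaged periodogram of one feature trajectory (needs len(x) >= seg_len).

    Assumption: this stands in for the utterance's modulation spectrum.
    """
    hop = seg_len // 2
    window = np.hamming(seg_len)
    starts = range(0, len(x) - seg_len + 1, hop)
    segs = np.stack([x[s:s + seg_len] * window for s in starts])
    return np.mean(np.abs(np.fft.rfft(segs, axis=1)) ** 2, axis=0)

def sqrt_gain_filter(psd_utt, psd_ref, n_taps=21):
    """Zero-phase FIR approximating the gain sqrt(psd_ref / psd_utt). n_taps is odd."""
    gain = np.sqrt(psd_ref / np.maximum(psd_utt, 1e-12))
    impulse = np.fft.irfft(gain)  # real and symmetric about index 0
    half = n_taps // 2
    # Move the response so that its peak sits at the middle tap.
    taps = np.concatenate([impulse[-half:], impulse[:half + 1]])
    return taps * np.hamming(len(taps))  # taper to limit truncation ripple

def normalize_modulation_spectrum(features, psd_ref, n_taps=21, seg_len=64):
    """Filter each feature trajectory along time toward the reference PSD.

    psd_ref has shape (dims, seg_len // 2 + 1). Assumption: in practice it
    would be estimated from clean training data.
    """
    out = np.empty(features.shape, dtype=float)
    for d in range(features.shape[1]):
        x = features[:, d]
        taps = sqrt_gain_filter(trajectory_psd(x, seg_len), psd_ref[d], n_taps)
        out[:, d] = np.convolve(x, taps, mode="same")
    return out
```

A typical call under these assumptions would be `normalize_modulation_spectrum(mvn(features), psd_ref)`, where `features` is a float array of shape (frames, dims). When the reference PSD equals the utterance's own PSD, the gain is one at every modulation frequency and the filter reduces to an identity, which is a convenient sanity check.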

Pages: 1021 / +

Page count: 2

