Normalizing the speech modulation spectrum for robust speech recognition

Cited by: 0
Authors
Xiao, Xiong [1, 2]
Chng, Eng Siong [1]
Li, Haizhou [1, 2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
[2] Inst Infocomm Res, Singapore, Singapore
Keywords
speech recognition; feature normalization; modulation spectrum; square-root Wiener filter; temporal filter;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper presents a novel feature normalization technique for robust speech recognition. The proposed technique normalizes the temporal structure of the features to reduce the feature variation caused by environmental interference. Specifically, it normalizes the utterance-dependent feature modulation spectrum to a reference function by filtering the features with a square-root Wiener filter in the temporal domain. We show experimentally that the proposed technique, when combined with mean and variance normalization (MVN), significantly reduces the word error rate on the AURORA-2 task, with a relative error rate reduction of 69.11% compared to the base me.
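The filtering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the periodogram-style spectrum estimate, the windowed FIR design, the filter length, the order in which MVN is applied, and all function and parameter names (`mvn`, `modulation_psd`, `sqrt_wiener_taps`, `normalize_modulation_spectrum`, `ref_psd`, `n_fft`, `n_taps`) are assumptions made for illustration. The reference modulation spectrum `ref_psd` is assumed to be supplied by the caller, for example averaged over training utterances.

```python
import numpy as np

def mvn(features, eps=1e-8):
    """Mean and variance normalization per feature dimension (frames x dims)."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

def modulation_psd(trajectory, n_fft):
    """Averaged-periodogram estimate of one trajectory's modulation spectrum."""
    n_seg = max(1, int(np.ceil(len(trajectory) / n_fft)))
    padded = np.zeros(n_seg * n_fft)
    padded[:len(trajectory)] = trajectory          # zero-pad the last segment
    segments = padded.reshape(n_seg, n_fft)
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    return power.sum(axis=0) / len(trajectory)

def sqrt_wiener_taps(utt_psd, ref_psd, n_taps, eps=1e-8):
    """Symmetric FIR taps approximating |H(f)| = sqrt(ref_psd / utt_psd).

    Assumes n_taps is odd so the response can be centered on one tap.
    """
    magnitude = np.sqrt(ref_psd / (utt_psd + eps))
    impulse = np.fft.irfft(magnitude)              # zero-phase impulse response
    half = n_taps // 2
    taps = np.concatenate([impulse[-half:], impulse[:half + 1]])
    return taps * np.hamming(len(taps))            # taper the truncated response

def normalize_modulation_spectrum(features, ref_psd, n_fft=256, n_taps=21):
    """Filter each feature trajectory along time (assumes frames >= n_taps).

    ref_psd has shape (dims, n_fft // 2 + 1). Applying MVN before the
    filter is an assumption of this sketch.
    """
    features = mvn(features)
    out = np.empty_like(features)
    for d in range(features.shape[1]):
        utt_psd = modulation_psd(features[:, d], n_fft)
        taps = sqrt_wiener_taps(utt_psd, ref_psd[d], n_taps)
        out[:, d] = np.convolve(features[:, d], taps, mode="same")
    return out
```

When an utterance's modulation spectrum already matches the reference, the designed filter reduces to a unit impulse and the features pass through essentially unchanged, which is the behavior one would expect from a normalization step.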
Pages: 1021 / +
Page count: 2
Related papers
50 in total
  • [1] Modulation spectrum equalization for robust speech recognition
    Sun, Liang-Che
    Hsu, Chang-Wen
    Lee, Lin-Shan
    [J]. 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 81 - 86
  • [2] Modulation Spectrum Augmentation for Robust Speech Recognition
    Yan, Bi-Cheng
    Liu, Shih-Hung
    Chen, Berlin
    [J]. PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION SCIENCE AND SYSTEM, AISS 2019, 2019,
  • [3] On properties of modulation spectrum for robust automatic speech recognition
    Kanedera, N
    Hermansky, H
    Arai, T
    [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 613 - 616
  • [4] Modulation spectrum exponential weighting for robust speech recognition
    Fan, Hao-teng
    Lian, Yi-cheng
    Hung, Jeih-weih
    [J]. 2012 12TH INTERNATIONAL CONFERENCE ON ITS TELECOMMUNICATIONS (ITST-2012), 2012, : 812 - 816
  • [5] Modulation Spectrum Equalization for Improved Robust Speech Recognition
    Sun, Liang-Che
    Lee, Lin-Shan
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (03): : 828 - 843
  • [6] Improved modulation spectrum enhancement methods for robust speech recognition
    Hung, Jeih-weih
    Tu, Wen-hsiang
    Lai, Chien-chou
    [J]. SIGNAL PROCESSING, 2012, 92 (11) : 2791 - 2814
  • [7] Improved modulation spectrum normalization techniques for robust speech recognition
    Pan, Chi-an
    Wang, Chieh-cheng
    Hung, Jeih-weih
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4089 - 4092
  • [8] Normalization of the Speech Modulation Spectra for Robust Speech Recognition
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (08): : 1662 - 1674
  • [9] Sub-band Modulation Spectrum Compensation for Robust Speech Recognition
    Tu, Wen-hsiang
    Huang, Sheng-Yuan
    Hung, Jeih-weih
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 261 - 265
  • [10] Modulation Spectrum Power-law Expansion for Robust Speech Recognition
    Fan, Hao-Teng
    Ye, Zi-Hao
    Hung, Jeih-Weih
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,