Modulation Spectrum Power-law Expansion for Robust Speech Recognition

Cited by: 0
Authors
Fan, Hao-Teng [1 ]
Ye, Zi-Hao [1 ]
Hung, Jeih-Weih [1 ]
Affiliation
[1] National Chi Nan University, Department of Electrical Engineering, Nantou, Taiwan
Keywords
EQUALIZATION;
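DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract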
In this paper, we present a novel approach to enhancing speech features in the modulation spectrum domain for better recognition performance in noise-corrupted environments. In the presented approach, termed modulation spectrum power-law expansion (MSPLE), the temporal stream of speech features is first pre-processed by a statistics compensation technique, such as mean and variance normalization (MVN), cepstral gain normalization (CGN), or MVN plus ARMA filtering (MVA). The magnitude of the modulation spectrum (the Fourier transform of the feature stream) is then raised to a power. We find that MSPLE can highlight the speech components and reduce the noise distortion remaining in the statistics-compensated speech features. On the Aurora-2 digit database task, experimental results show that this process consistently achieves promising recognition accuracy across a wide range of noise-corrupted environments. MSPLE applied to MVN-preprocessed features yields a relative error rate reduction of about 55% over the MFCC baseline and significantly outperforms MVN alone. Furthermore, performing MSPLE on only the lower sub-band of the modulation spectrum gives results very close to those obtained by applying MSPLE to the full-band modulation spectrum, indicating that a less complicated MSPLE suffices to produce noise-robust speech features.
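The two steps described in the abstract (statistics compensation, then raising the modulation spectrum magnitude to a power) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the exponent value of 1.5, the choice to keep the original phase, and the reconstruction by inverse transform are all assumptions, since the abstract does not specify them. A naive DFT from the standard library is used only to keep the sketch self-contained; a practical version would use an FFT routine.

```python
import cmath
import math


def mvn(stream):
    """Mean and variance normalization of one feature stream (list of floats)."""
    n = len(stream)
    mean = sum(stream) / n
    var = sum((x - mean) ** 2 for x in stream) / n
    std = math.sqrt(var) if var > 0 else 1.0  # guard against a constant stream
    return [(x - mean) / std for x in stream]


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive O(N^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def msple(stream, exponent=1.5):
    """Hypothetical MSPLE sketch for a single feature stream.

    exponent is an assumed value; the abstract only says the magnitude
    is raised to a power.
    """
    normalized = mvn(stream)              # statistics compensation (MVN here)
    spectrum = dft(normalized)            # modulation spectrum of the stream
    expanded = [
        # raise the magnitude to a power, keep the phase (assumption)
        cmath.rect(abs(c) ** exponent, cmath.phase(c))
        for c in spectrum
    ]
    # the input is real, so the imaginary part is numerical noise only
    return [v.real for v in idft(expanded)]
```

For example, `msple([0.0, 1.0, 4.0, 2.0, -1.0, 3.0, 0.5, 2.5])` returns a new stream of the same length; with `exponent=1.0` the magnitude is unchanged and the result reduces to the MVN-normalized stream.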
Pages: 5