Normalization on temporal modulation transfer function for robust speech recognition

Cited by: 3
Authors
Lu, X.
Matsuda, S.
Shimizu, T.
Nakamura, S.
DOI
10.1109/ISUC.2008.74
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we proposed a robust speech feature extraction algorithm for automatic speech recognition that reduces the effect of noise in the temporal modulation domain. The algorithm processes the time series of cepstral coefficients in two steps. The first step applies modulation contrast normalization so that the temporal modulation contrast of clean and noisy speech falls in the same range. The second step applies edge-preserved smoothing to attenuate the low modulation components while preserving the high modulation components (edges). We evaluated the algorithm in speech recognition experiments under both additive noise (AURORA-2J data corpus) and reverberant noise (clean speech utterances from AURORA-2J convolved with a smart room impulse response). The ETSI advanced front-end (AFE) algorithm was used for comparison. The results were: (1) for additive noise, a 57.26% relative word error reduction (RWER) rate for clean-condition training (59.37% for AFE) and a 33.52% RWER rate for multi-condition training (35.77% for AFE); (2) for reverberant noise, a 51.28% RWER rate (10.17% for AFE).
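The two-step structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so both steps below are stand-ins chosen for illustration: per-coefficient mean and variance normalization substitutes for the modulation contrast normalization, and a sliding median filter substitutes for the edge-preserved smoothing. The function names, the `width` parameter, and the `eps` guard are all hypothetical and are not taken from the paper.

```python
from statistics import mean, median, pstdev


def normalize_contrast(track, eps=1e-8):
    """Rescale one cepstral-coefficient time series to zero mean, unit variance.

    Assumption: a simple stand-in for the paper's modulation contrast
    normalization, whose exact definition the abstract does not state.
    """
    mu = mean(track)
    sigma = pstdev(track)
    return [(x - mu) / (sigma + eps) for x in track]


def median_smooth(track, width=5):
    """Sliding-median filter with a window that shrinks at the boundaries.

    Assumption: a generic edge-preserving smoother, used here in place of
    the paper's edge-preserved smoothing. It removes isolated spikes while
    keeping step-like transitions sharp.
    """
    half = width // 2
    n = len(track)
    smoothed = []
    for i in range(n):
        window = track[max(0, i - half):min(n, i + half + 1)]
        smoothed.append(median(window))
    return smoothed


def process_track(track, width=5):
    """Apply the two steps in order to one coefficient's time series."""
    return median_smooth(normalize_contrast(track), width)
```

In use, `process_track` would be applied independently to the time series of each cepstral coefficient across the frames of an utterance. A median filter is only one of several edge-preserving smoothers; it is used here because it needs nothing beyond the standard library.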
Pages: 16-23
Page count: 8