Recognition of noisy speech using dynamic spectral subband centroids

Cited by: 33
Authors
Chen, JD
Huang, YT
Li, Q
Paliwal, KK
Affiliations
[1] Bell Labs, Murray Hill, NJ 07974 USA
[2] Griffith Univ, Sch Microelect Engn, Nathan, Qld 4111, Australia
Keywords
cepstrum; robust speech recognition; subband centroid
DOI
10.1109/LSP.2003.821689
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Despite their widespread popularity as front-end parameters for speech recognition, the cepstral coefficients derived from either linear prediction analysis or a filter-bank are found to be sensitive to additive noise. In this letter, we discuss the use of spectral subband centroids for robust speech recognition. We show that centroids, if properly selected, can achieve recognition performance comparable to that of the mel-frequency cepstral coefficients (MFCCs) in clean speech, while delivering better performance than MFCC in noisy environments. A procedure is proposed to construct the dynamic centroid feature vector that essentially embodies the transitional spectral information. We discuss some properties of the proposed dynamic features.
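As a rough illustration of the feature the abstract refers to, a spectral subband centroid can be taken as the power-weighted mean frequency within a subband. The sketch below is a minimal version under stated assumptions: the function name `subband_centroids`, the equal-width rectangular subbands, the exponent `gamma`, and the midpoint fallback for empty bands are all illustrative choices, not details taken from the letter. It does not attempt the letter's dynamic centroid construction, which the abstract does not specify.

```python
import numpy as np

def subband_centroids(power_spectrum, sample_rate, num_subbands=4, gamma=1.0):
    """Return one centroid frequency (Hz) per subband of a one-sided power spectrum."""
    power_spectrum = np.asarray(power_spectrum, dtype=float)
    # Frequency of each bin, from 0 Hz up to the Nyquist frequency.
    freqs = np.linspace(0.0, sample_rate / 2.0, len(power_spectrum))
    # Assumption: equal-width, non-overlapping rectangular subbands.
    edges = np.linspace(0.0, sample_rate / 2.0, num_subbands + 1)
    weighted = power_spectrum ** gamma
    centroids = np.empty(num_subbands)
    for m in range(num_subbands):
        lo, hi = edges[m], edges[m + 1]
        if m == num_subbands - 1:
            mask = (freqs >= lo) & (freqs <= hi)  # last band includes Nyquist
        else:
            mask = (freqs >= lo) & (freqs < hi)
        total = weighted[mask].sum()
        if total > 0:
            # Power-weighted mean frequency inside the subband.
            centroids[m] = (freqs[mask] * weighted[mask]).sum() / total
        else:
            # Assumption: fall back to the band midpoint when the band is empty.
            centroids[m] = 0.5 * (lo + hi)
    return centroids
```

For a spectrum with a single peak, the centroid of the subband containing the peak sits at the peak frequency, which gives some intuition for why such features track dominant spectral peaks rather than overall spectral level.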
Pages: 258-261
Page count: 4
Related papers
50 in total
  • [31] Subband feature extraction using lapped orthogonal transform for speech recognition
    Tufekci, Z
    Gowdy, JN
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS, 2001, : 149 - 152
  • [32] JOINT SPECTRAL AND TEMPORAL NORMALIZATION OF FEATURES FOR ROBUST RECOGNITION OF NOISY AND REVERBERATED SPEECH
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4325 - 4328
  • [33] Perceptual speech modeling for noisy speech recognition
    Wu, CH
    Chiu, YH
    Lim, H
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 385 - 388
  • [34] Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition
    Matsumoto, H
    Naitoh, N
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 905 - 908
  • [35] Noisy speech recognition based on speech enhancement
    Wang, Xia
    Tang, Hongmei
    Zhao, Xiaoqun
    SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 713 - +
  • [36] Noise robust speech recognition using subband-crosscorrelation analysis
    Kajita, S
    Takeda, K
    Itakura, F
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1998, E81D (10) : 1079 - 1086
  • [37] Noise estimation using speech/non-speech frame decision and subband spectral tracking
    Lin, Zhong
    Goubran, Rafik A.
    Dansereau, Richard M.
    SPEECH COMMUNICATION, 2007, 49 (7-8) : 542 - 557
  • [38] Speech recognition of spontaneous, noisy speech using auxiliary information in Bayesian networks
    Stephenson, TA
    Magimai-Doss, M
    Bourlard, H
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 20 - 23
  • [39] Using Speech Enhancement Preprocessing for Speech Emotion Recognition in Realistic Noisy Conditions
    Zhou, Hengshun
    Du, Jun
    Tu, Yan-Hui
    Lee, Chin-Hui
    INTERSPEECH 2020, 2020, : 4098 - 4102
  • [40] Quality Estimation of Noisy Speech Using Spectral Entropy Distance
    Mittag, Gabriel
    Moeller, Sebastian
    2019 26TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS (ICT), 2019, : 197 - 201