A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

Cited by: 0
Authors
Wang, Syu-Siang [1]
Hung, Jeih-Weih [2]
Tsao, Yu [1]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan
[2] Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan
Source
2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | 2012
Keywords
discrete wavelet transform; CMS; CMVN; RASTA; noise robust; speech recognition; SPEECH; NOISE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a cepstral subband normalization (CSN) approach for robust speech recognition. CSN first applies the discrete wavelet transform (DWT) to decompose the original cepstral feature sequence into low-frequency band (LFB) and high-frequency band (HFB) parts. It then normalizes the LFB components and zeros out the HFB components. Finally, an inverse DWT is applied to the LFB and HFB components to form the normalized cepstral features. When Haar functions are used as the DWT bases, CSN can be computed efficiently, with a 50% reduction in the number of feature components. In addition, our experimental results on the Aurora-2 task show that CSN outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and histogram equalization (HEQ). We also integrate CSN with the advanced front-end (AFE) for feature extraction. Experimental results indicate that the integrated AFE+CSN achieves notable improvements over the original AFE. Its simple calculation, compact form, and effective noise robustness make CSN well suited to mobile applications.
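The three steps in the abstract (DWT split, LFB normalization with HFB zeroing, inverse DWT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single-level Haar DWT along the time axis, per-dimension mean and variance normalization of the LFB (the abstract does not say which normalization is used), and padding of odd-length sequences by repeating the last frame. The names `haar_dwt`, `haar_idwt`, and `csn_sketch` are hypothetical.

```python
import numpy as np


def haar_dwt(x):
    """One-level Haar DWT along axis 0 (time). x has an even number of frames."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-frequency band (LFB)
    high = (even - odd) / np.sqrt(2.0)  # high-frequency band (HFB)
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleave the reconstructed even and odd frames."""
    x = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=float)
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x


def csn_sketch(cepstra, eps=1e-8):
    """Sketch of CSN on a (frames, dims) cepstral sequence.

    Assumption: the LFB is normalized to zero mean and unit variance
    per cepstral dimension; the paper may use a different scheme.
    """
    x = np.asarray(cepstra, dtype=float)
    n_frames = x.shape[0]
    if n_frames % 2:                      # assumption: pad odd lengths
        x = np.vstack([x, x[-1:]])
    low, high = haar_dwt(x)
    low = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)
    high = np.zeros_like(high)            # zero out the HFB
    return haar_idwt(low, high)[:n_frames]
```

Because the HFB is zeroed, each pair of adjacent output frames in this sketch is identical, so only the normalized LFB (half as many frames) needs to be stored. That is one plausible reading of the 50% reduction mentioned in the abstract.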
Pages: 141-145
Page count: 5
Related papers
50 in total
  • [31] Noise robust speaker identification using sub-band weighting in multi-band approach
    Kim, Sungtak
    Ji, Mikyong
    Suh, Youngjoo
    Kim, Hoirin
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (12) : 2110 - 2114
  • [32] Noise Aware Sub-band Locality Preserving Projection for Robust Speech Recognition
    Karevan, Zahra
    Akbari, Ahmad
    Nasersharif, Babak
    ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING, AISP 2013, 2014, 427 : 203 - +
  • [33] WAVELET SUB-BAND BASED TEMPORAL FEATURES FOR ROBUST HINDI PHONEME RECOGNITION
    Farooq, O.
    Datta, S.
    Shrotriya, M. C.
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2010, 8 (06) : 847 - 859
  • [34] ON THE PERFORMANCE OF THE ROBUST ACOUSTIC ECHO CANCELLATION SYSTEM WITH DECORRELATION BY SUB-BAND RESAMPLING
    Wung, Jason
    Wada, Ted S.
    Juang, Biing-Hwang
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 635 - 638
  • [35] PARAMETRIC CEPSTRAL MEAN NORMALIZATION FOR ROBUST SPEECH RECOGNITION
    Kalinli, Ozlem
    Bhattacharya, Gautam
    Weng, Chao
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6735 - 6739
  • [36] Cepstral gain normalization for noise robust speech recognition
    Yoshizawa, S
    Hayasaka, N
    Wada, N
    Miyanaga, Y
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 209 - 212
  • [37] Sub-band Filtering in Compressive Domain
    Prakash, Chandra
    Chakka, Vijay Kr
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 496 - 500
  • [38] A study on the pitch extraction detection by linear approximation of sub-band
    Lee, Keun Wang
    Lee, Kwang Hyoung
    Min, So Yeon
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2006, PT 2, 2006, 3981 : 1074 - 1081
  • [39] Sub-band coding of hexagonal images
    Rashid, Md Mamunur
    Alim, Usman R.
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2021, 99
  • [40] Sub-band source coding for HDTV
    Mau, J.
    Bourguignat, E.
    Amor, H.
    EBU Technical Review (European Broadcasting Union) Brussels, 1992, (251):