A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

Cited by: 0
Authors
Wang, Syu-Siang [1]
Hung, Jeih-Weih [2]
Tsao, Yu [1]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan
[2] Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan
Keywords
discrete wavelet transform; CMS; CMVN; RASTA; noise robust; speech recognition; SPEECH; NOISE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a cepstral sub-band normalization (CSN) approach for robust speech recognition. The CSN approach first applies the discrete wavelet transform (DWT) to decompose the original cepstral feature sequence into low-frequency band (LFB) and high-frequency band (HFB) parts. CSN then normalizes the LFB components and zeros out the HFB components. Finally, an inverse DWT is applied to the LFB and HFB components to form the normalized cepstral features. When Haar functions are used as the DWT bases, CSN can be computed efficiently, with a 50% reduction in the number of feature components. In addition, our experimental results on the Aurora-2 task show that CSN outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and histogram equalization (HEQ). We also integrate CSN with the advanced front-end (AFE) for feature extraction. Experimental results indicate that the integrated AFE+CSN achieves notable improvements over the original AFE. Its simple calculation, compact form, and effective noise robustness make CSN well suited to mobile applications.
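The three steps described in the abstract (decompose, normalize the LFB and zero the HFB, reconstruct) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The abstract does not say which normalization is applied to the LFB, so per-dimension mean and variance normalization is assumed here. The function names `haar_dwt`, `haar_idwt`, and `csn`, the single decomposition level, and the padding of odd-length sequences by repeating the last frame are likewise assumptions made for illustration.

```python
import numpy as np


def haar_dwt(x):
    """One-level Haar DWT along the time axis (axis 0).

    x: array of shape (frames, dims) with an even number of frames.
    Returns the low- and high-frequency band coefficients.
    """
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleave the reconstructed frame pairs."""
    out = np.empty((2 * low.shape[0], low.shape[1]), dtype=float)
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return out


def csn(cepstra, eps=1e-8):
    """Sketch of cepstral sub-band normalization for one utterance.

    cepstra: array of shape (frames, dims), one cepstral vector per frame.
    ASSUMPTION: the LFB is mean- and variance-normalized per dimension.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    frames = cepstra.shape[0]
    if frames % 2:
        # ASSUMPTION: pad odd-length utterances by repeating the last frame.
        cepstra = np.concatenate([cepstra, cepstra[-1:]], axis=0)

    low, _ = haar_dwt(cepstra)                                  # decompose
    low = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)    # normalize LFB
    high = np.zeros_like(low)                                   # zero out HFB
    return haar_idwt(low, high)[:frames]                        # reconstruct
```

In this sketch, zeroing the HFB makes each reconstructed pair of consecutive frames identical, so only the LFB coefficients carry information. That is one way to read the 50% reduction in feature components mentioned in the abstract.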
Pages: 141-145
Page count: 5