A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

Cited by: 0

Authors:

Wang, Syu-Siang [1]

Hung, Jeih-Weih [2]

Tsao, Yu [1]

Affiliations:

[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan

[2] Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan

Source:

2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | 2012

Keywords:

discrete wavelet transform; CMS; CMVN; RASTA; noise robust; speech recognition; SPEECH; NOISE

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we propose a cepstral subband normalization (CSN) approach for robust speech recognition. The CSN approach first applies the discrete wavelet transform (DWT) to decompose the original cepstral feature sequence into low-frequency band (LFB) and high-frequency band (HFB) parts. CSN then normalizes the LFB components and zeros out the HFB components. Finally, an inverse DWT is applied to the LFB and HFB components to form the normalized cepstral features. When the Haar functions are used as the DWT bases, CSN can be computed efficiently, with a 50% reduction in the number of feature components. In addition, our experimental results on the Aurora-2 task show that CSN outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and histogram equalization (HEQ). We also integrate CSN with the advanced front-end (AFE) for feature extraction. Experimental results indicate that the integrated AFE+CSN achieves notable improvements over the original AFE. Its simple calculation, compact form, and effective noise robustness make CSN well suited to mobile applications.
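The three steps described in the abstract (DWT decomposition, LFB normalization with HFB zeroing, inverse DWT) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `csn_haar`, the single-level Haar transform along the time axis, the choice of mean and variance normalization for the LFB, and the padding of odd-length sequences are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def csn_haar(cepstra, eps=1e-8):
    """Illustrative cepstral sub-band normalization with a one-level Haar DWT.

    cepstra: array of shape (num_frames, num_coeffs), one cepstral vector per frame.
    Returns an array of the same shape.
    """
    x = np.asarray(cepstra, dtype=float)
    num_frames = x.shape[0]

    # Assumption: pad odd-length sequences by repeating the last frame.
    if num_frames % 2 == 1:
        x = np.vstack([x, x[-1:]])

    # One-level Haar analysis along time.
    # LFB (approximation) = scaled sum of each pair of adjacent frames.
    # The HFB (detail) would be the scaled difference; it is zeroed out,
    # so it never needs to be computed.
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)

    # Assumption: normalize each LFB coefficient track to zero mean and
    # unit variance (the abstract only says the LFB is "normalized").
    low_norm = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)

    # Haar synthesis with a zero HFB: both frames of a pair receive the
    # same value, low_norm / sqrt(2).
    out = np.empty_like(x)
    out[0::2] = low_norm / np.sqrt(2.0)
    out[1::2] = low_norm / np.sqrt(2.0)

    # Drop the padded frame, if any.
    return out[:num_frames]


# Usage: 10 frames of 13 hypothetical cepstral coefficients.
features = np.random.default_rng(0).normal(size=(10, 13))
normalized = csn_haar(features)
```

With a zeroed HFB, each pair of adjacent output frames is identical, so only the LFB values (half as many frames) need to be stored or processed. This is one way to read the 50% reduction mentioned in the abstract.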


Pages: 141-145

Page count: 5

50 items in total

[1] Overlapped sub-band modulation spectrum normalization techniques for robust speech recognition
Fan, Hao-teng
Yeh, Wei-jeih
Hung, Jeih-weih
2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 1035 - 1039
[2] Wavelet Packet Sub-band Cepstral Coefficient for Speaker Verification
Min, Hang
Wei, Guangcun
Xu, Yunfei
Zhang, Yanna
2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2022, : 1713 - 1717
[3] GAMMATONE SUB-BAND MAGNITUDE-DOMAIN DEREVERBERATION FOR ASR
Kumar, Kshitiz
Singh, Rita
Raj, Bhiksha
Stern, Richard
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4604 - 4607
[4] Sub-band based histogram equalization in cepstral domain for speech recognition
Joshi, Vikas
Bilgi, Raghvendra
Umesh, S.
Garcia, Luz
Benitez, Carmen
SPEECH COMMUNICATION, 2015, 69 : 46 - 65
[5] Intra-frame cepstral sub-band weighting and histogram equalization for noise-robust speech recognition
Hung J.-W.
Fan H.-T.
Hung, Jeih-weih (jwhung@ncnu.edu.tw), 1600, Springer International Publishing (2013):
[6] Sub-Band Based Attention for Robust Polyp Segmentation
Fang, Xianyong
Shi, Yuqing
Guo, Qingqing
Wang, Linbo
Liu, Zhengyi
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 736 - 744
[7] TfCleanformer: A streaming, array-agnostic, full- and sub-band modeling front-end for robust ASR
Heitkaemper, Jens
Caroselli, Joe
Narayanan, Arun
Howard, Nathan
INTERSPEECH 2024, 2024, : 4473 - 4477
[8] Vocal Tract Length Normalization and Sub-Band Spectral Subtraction Based Robust Assamese Vowel Recognition System
Gogoi, Swapnanil
Bhattacharjee, Utpal
2017 INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC), 2017, : 32 - 35
[9] Sub-band Robust GNSS Signal Processing for Jamming Mitigation
Borio, Daniele
2018 EUROPEAN NAVIGATION CONFERENCE (ENC), 2018, : 72 - 83
[10] High robust watermarking technique using sub-band filtering
Hsia, SC
Jou, IC
10TH INTERNATIONAL MULTIMEDIA MODELLING CONFERENCE, PROCEEDINGS, 2004, : 72 - 78
