Speech frame recognition based on less shift sensitive wavelet filter banks

Cited by: 1

Authors:

Tohidypour, Hamid Reza [1]

Banitalebi-Dehkordi, Amin [1]

Affiliations:

[1] Univ British Columbia, Dept Elect & Comp Engn, Digital Multimedia Lab, Vancouver, BC V6T 1Z4, Canada

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2016 / Vol. 10 / No. 04

Keywords:

Dual-tree complex wavelet transform (DT-CWT); Four-channel double-density discrete wavelet transform (FCDDDWT); Redundant wavelet filter bank (RWFB); Wavelet transform (WT); Speech frame recognition; Perceptual dual-tree complex wavelet filter bank; ROBUST; REPRESENTATION; COEFFICIENTS;

DOI:

10.1007/s11760-015-0787-z

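Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract: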

The wavelet transform possesses the multi-resolution property and high localization performance; hence, it can be optimized for speech recognition. In our previous work, we showed that redundant wavelet filter bank parameters work better in speech recognition tasks, because they are much less shift sensitive than those of the critically sampled discrete wavelet transform (DWT). In this paper, three types of wavelet representations are introduced, including features based on the dual-tree complex wavelet transform (DT-CWT), the perceptual dual-tree complex wavelet transform, and the four-channel double-density discrete wavelet transform (FCDDDWT). Then, appropriate filter values for DT-CWT and FCDDDWT are proposed. The performance of the proposed wavelet representations is compared in a phoneme recognition task using a special form of time-delay neural network. Performance evaluations confirm that dual-tree complex wavelet filter banks outperform the conventional DWT in speech recognition systems. The proposed perceptual dual-tree complex wavelet filter bank yields an increase in recognition rate of up to approximately 9.82% compared to the critically sampled two-channel wavelet filter bank.
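
The shift sensitivity mentioned in the abstract can be illustrated with a small sketch. This is not the authors' method: it assumes a one-level Haar filter as a stand-in for the critically sampled DWT and a plain undecimated (non-downsampled) Haar transform as a stand-in for a redundant filter bank, rather than DT-CWT or FCDDDWT. All function names are hypothetical. It compares the detail-band energy of a step signal before and after a one-sample circular delay.

```python
import math

def haar_detail_decimated(x):
    """One-level Haar detail coefficients, critically sampled (one per input pair)."""
    s = 1 / math.sqrt(2)
    return [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]

def haar_detail_undecimated(x):
    """One-level Haar detail coefficients without downsampling (circular boundary)."""
    s = 1 / math.sqrt(2)
    n = len(x)
    return [s * (x[i] - x[(i + 1) % n]) for i in range(n)]

def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(v * v for v in coeffs)

def shift(x, k):
    """Circularly delay x by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

# Illustrative input (an assumption, not data from the paper): a step edge.
x = [0.0] * 8 + [1.0] * 8
x1 = shift(x, 1)  # same signal, delayed by one sample

# Critically sampled: the edge falls between pairs for x but inside a pair for x1,
# so the detail energy changes with the shift.
e_dec_0 = energy(haar_detail_decimated(x))
e_dec_1 = energy(haar_detail_decimated(x1))

# Undecimated: the coefficients are only circularly shifted, so the energy is unchanged.
e_und_0 = energy(haar_detail_undecimated(x))
e_und_1 = energy(haar_detail_undecimated(x1))

print("decimated:  ", e_dec_0, e_dec_1)
print("undecimated:", e_und_0, e_und_1)
```

In this toy case, the decimated detail energy depends on where the edge falls relative to the downsampling grid, while the undecimated energy does not. That is the kind of instability a frame-based recognizer would see in its features when the same phoneme is framed with a slightly different alignment.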


Pages: 633-637

Page count: 5

