A new representation for speech frame recognition based on redundant wavelet filter banks

Cited by: 8
Authors
Tohidypour, Hamid Reza [1]
Seyyedsalehi, Seyyed Ali [1]
Behbood, Hossein [1]
Roshandel, Hossein [2]
Affiliations
[1] Amirkabir Univ Technol, Dept Biomed Engn, Tehran 158754413, Iran
[2] Amirkabir Univ Technol, Dept Elect Engn, Tehran Polytech, Tehran 158754413, Iran
Keywords
Redundant wavelet filter-bank (RWFB); Wavelet transform (WT); Speech frame recognition; Representation; Frame wavelet; Zero moments; Four-channel higher density discrete wavelet; Time delay neural network (TDNN); TRANSFORM;
DOI
10.1016/j.specom.2011.09.001
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Although the conventional wavelet transform possesses multi-resolution properties, it is not optimized for speech recognition systems. It suffers from lower performance compared with Mel Frequency Cepstral Coefficients (MFCCs) in which Mel scale is based on human auditory perception. In this paper, some new speech representations based on redundant wavelet filter-banks (RWFB) are proposed. RWFB parameters are much less shift-sensitive than those of critically sampled discrete wavelet transform (DWT), so they seem to feature better performance in speech recognition tasks because of having better time-frequency localization ability. However, the improvement is at the expense of higher redundancy. In this paper, some types of wavelet representations are introduced, including a combination of critically sampled DWT and some different multi-channel redundant filter-banks down-sampled by 2. In order to find appropriate filter values for multi-channel filter-banks, effects of changing the zero moments of proposed wavelet are discussed. The corresponding method performances are compared in a phoneme recognition task using time delay neural networks. It is revealed that redundant multi-channel wavelet filter-banks work better than conventional DWT in speech recognition systems. The proposed four-channel higher density discrete wavelet filter-bank results in up to approximately 8.95% recognition rate increase, compared with critically sampled two-channel wavelet filter-bank. (C) 2011 Elsevier B.V. All rights reserved.
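The abstract's claim that redundant filter-bank coefficients are less shift-sensitive than those of the critically sampled DWT can be illustrated with a minimal sketch. It is not the authors' method: it assumes Haar filters, a single decomposition level, circular signal extension, and a fully undecimated transform as a simple stand-in for the paper's multi-channel filter-banks down-sampled by 2. All function and variable names are hypothetical.

```python
import numpy as np


def haar_dwt_level(x):
    """One level of a critically sampled Haar DWT (filter, then keep every 2nd sample)."""
    x = np.asarray(x, dtype=float)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)
    return lo, hi


def haar_undecimated_level(x):
    """One level of an undecimated (redundant) Haar transform with circular extension."""
    x = np.asarray(x, dtype=float)
    nxt = np.roll(x, -1)  # x[n + 1], wrapping around at the end
    lo = (x + nxt) / np.sqrt(2)
    hi = (x - nxt) / np.sqrt(2)
    return lo, hi


def energy(c):
    """Sum of squared coefficients in one sub-band."""
    return float(np.sum(np.square(c)))


# Toy "frame": a two-sample pulse, and the same pulse delayed by one sample.
x = np.zeros(16)
x[4:6] = 1.0
x_shift = np.roll(x, 1)

# High-pass sub-band energy for each transform, before and after the shift.
e_dwt = energy(haar_dwt_level(x)[1])
e_dwt_shift = energy(haar_dwt_level(x_shift)[1])
e_und = energy(haar_undecimated_level(x)[1])
e_und_shift = energy(haar_undecimated_level(x_shift)[1])

print("critically sampled high-band energy:", e_dwt, e_dwt_shift)
print("undecimated high-band energy:       ", e_und, e_und_shift)
```

In this toy case the high-band energy of the critically sampled transform changes when the pulse moves by one sample, while the undecimated transform's energy does not, at the cost of producing twice as many coefficients. Filter-banks that are redundant but still down-sampled by 2, as in the paper, sit between these two extremes.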
Pages: 256-271
Number of pages: 16
Related papers
50 in total
  • [1] Speech frame recognition based on less shift sensitive wavelet filter banks
    Tohidypour, Hamid Reza
    Banitalebi-Dehkordi, Amin
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2016, 10 (04) : 633 - 637
  • [2] Speech frame recognition based on less shift sensitive wavelet filter banks
    Hamid Reza Tohidypour
    Amin Banitalebi-Dehkordi
    [J]. Signal, Image and Video Processing, 2016, 10 : 633 - 637
  • [3] New features for speech enhancement using bivariate shrinkage based on redundant wavelet filter-banks
    Tohidypour, Hamid Reza
    Ahadi, Seyed Mohammad
    [J]. COMPUTER SPEECH AND LANGUAGE, 2016, 35 : 93 - 115
  • [4] Frame analysis of wavelet type filter banks
    Stanhill, D
    Zeevi, YY
    [J]. 1996 IEEE DIGITAL SIGNAL PROCESSING WORKSHOP, PROCEEDINGS, 1996, : 435 - 438
  • [5] Sparse Wavelet Decomposition and Filter Banks with CNN Deep Learning for Speech Recognition
    Dai, Jingzhao
    Zhang, Yaan
    Hou, Jintao
    Wang, Xiewen
    Tan, Lizhe
    Jiang, Jean
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ELECTRO INFORMATION TECHNOLOGY (EIT), 2019, : 98 - 103
  • [6] Speech recognition based on auditory wavelet packet filter
    Zhang, XY
    Jiao, ZP
    [J]. 2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 695 - 698
  • [7] Frame analysis of wavelet-type filter banks
    Stanhill, D
    Zeevi, YY
    [J]. SIGNAL PROCESSING, 1998, 67 (02) : 125 - 139
  • [8] Method of virtual components for constructing redundant filter banks and wavelet frames
    Lai, Ming-Jun
    Petukhov, Alexander
    [J]. APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2007, 22 (03) : 304 - 318
  • [9] TEXTURE-BASED FINGERPRINT RECOGNITION COMBINING DIRECTIONAL FILTER BANKS AND WAVELET
    Li, Chaorong
    Fu, Bo
    Li, Jianping
    Yang, Xingchun
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2012, 26 (04)
  • [10] A new feature in speech recognition based on wavelet transform
    Hao, Y
    Zhu, XY
    [J]. 2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 1526 - 1529