Noisy speech recognition using de-noised multiresolution analysis acoustic features

Cited by: 10
Authors
Chan, CP [1]
Ching, PC [1]
Lee, T [1]
Affiliation
[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China
Keywords
Cepstral mean normalization - Feature parameters - High frequency bands - Mel-frequency cepstral coefficients - Noisy speech recognition - Novel applications - Robust speech recognition - Wavelet packet filters
DOI
10.1121/1.1398054
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes a novel application of multiresolution analysis (MRA) in extracting acoustic features that possess de-noising capability for robust speech recognition. The MRA algorithm is used to construct a mel-scaled wavelet packet filter-bank, from which subband powers are computed as the feature parameters for speech recognition. Wiener filtering is applied to a few selected subbands at some intermediate stages of decomposition. For high-frequency bands, Wiener filters are designed based on a reduced fraction of the estimated noise power, making the consonant features much more prominent and contrastive. The proposed method is evaluated in phone recognition experiments with the TIMIT database. In the presence of stationary white noise at 10-dB SNR, the de-noised MRA features attain a phone recognition rate of 32%. This is a noticeable improvement over the accuracies of 29% and 20% attained by the commonly used mel-frequency cepstral coefficients (MFCC) with and without cepstral mean normalization (CMN), respectively. The effectiveness of the MRA features is also verified by the fact that they exhibit smaller distortion from clean speech. © 2001 Acoustical Society of America.
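The idea of designing a Wiener filter from only a fraction of the estimated noise power can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no filter equations, subband choices, or fraction values, so the gain formula (a power-domain estimate of S / (S + N)), the function names, and the numbers below are all assumptions made for illustration. The sketch also works on one scalar power per subband rather than filtering signals at intermediate decomposition stages.

```python
def wiener_gain(noisy_power, noise_power, noise_fraction=1.0):
    """Wiener-style gain S / (S + N) for one subband (assumed form).

    The clean power S is estimated as noisy_power minus a scaled noise
    estimate; noise_fraction < 1 subtracts less noise, which keeps more
    of the weak energy in the band.
    """
    if noisy_power <= 0.0:
        return 0.0
    scaled_noise = noise_fraction * noise_power
    clean_estimate = max(noisy_power - scaled_noise, 0.0)
    return clean_estimate / noisy_power


def denoised_subband_powers(noisy_powers, noise_powers, noise_fractions):
    """Apply the gain to each subband and return the de-noised powers.

    A gain g applied to a signal scales its power by g squared.
    """
    result = []
    for p, n, a in zip(noisy_powers, noise_powers, noise_fractions):
        g = wiener_gain(p, n, a)
        result.append(g * g * p)
    return result


# Hypothetical values: three subbands, the last one high-frequency.
noisy = [4.0, 2.0, 1.5]
noise = [1.0, 1.0, 1.0]
fractions = [1.0, 1.0, 0.5]  # reduced noise fraction in the high band only
features = denoised_subband_powers(noisy, noise, fractions)
```

With these made-up numbers, the high band keeps a gain of about 0.67 instead of about 0.33 under full noise subtraction, which is the sense in which a reduced fraction leaves low-energy consonant cues more prominent.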
Pages: 2567-2574
Number of pages: 8
Related papers
50 in total
  • [31] End-to-End Noisy Speech Recognition Using Fourier and Hilbert Spectrum Features
    Vazhenina, Daria
    Markov, Konstantin
    ELECTRONICS, 2020, 9 (07) : 1 - 18
  • [32] Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments
    Isobe, Shinnosuke
    Tamura, Satoshi
    Hayamizu, Satoru
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021, : 63 - 70
  • [33] Speech Recognition and Acoustic Features in Combined Electric and Acoustic Stimulation
    Yoon, Yang-Soo
    Li, Yongxin
    Fu, Qian-Jie
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2012, 55 (01): 105 - 124
  • [34] INDEPENDENT COMPONENT ANALYSIS FOR NOISY SPEECH RECOGNITION
    Hsieh, Hsin-Lung
    Chien, Jen-Tzung
    Shinoda, Koichi
    Furui, Sadaoki
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 4369 - +
  • [35] Estimation of speech recognition performance in noisy and reverberant environments using PESQ score and acoustic parameters
    Fukumori, Takahiro
    Nakayama, Masato
    Nishiura, Takanobu
    Yamashita, Yoichi
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [36] Multi-stream acoustic model adaptation for noisy speech recognition
    Tamura, Satoshi
    Hayamizu, Satoru
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [37] Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
    Bratoszewski, Piotr
    Szwoch, Grzegorz
    Czyzewski, Andrzej
    2016 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2016, : 287 - 291
  • [38] Enhanced Multichannel Histogram Equalization for Speech Recognition in noisy acoustic conditions
    Principi, Emanuele
    Rotili, Rudy
    Squartini, Stefano
    NEURAL NETS WIRN11, 2011, 234 : 149 - 161
  • [39] Machine learning techniques for speech emotion recognition using paralinguistic acoustic features
    Jha T.
    Kavya R.
    Christopher J.
    Arunachalam V.
    International Journal of Speech Technology, 2022, 25 (03): 707 - 725
  • [40] Deep fusion framework for speech command recognition using acoustic and linguistic features
    Sunakshi Mehra
    Seba Susan
    Multimedia Tools and Applications, 2023, 82 : 38667 - 38691