Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition

Cited by: 0
Authors
Gomez, Randy [1]
Kawahara, Tatsuya [1]
Affiliations
[1] Kyoto Univ, ACCMS, Sakyo Ku, Kyoto 6068501, Japan
Keywords
Speech recognition; Robustness; Dereverberation; Wavelet Packets;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a multiple-resolution signal analysis to suppress the late reflection of reverberation for robust automatic speech recognition (ASR). Wavelet packet tree (WPT) decomposition offers a finer resolution to discriminate the late reflection subspace from the speech subspace. By selecting appropriate wavelet bases in the WPT for speech and late reflection, we can effectively estimate the Wiener gain directly from the observed reverberant data. Moreover, the selection procedure is performed in accordance with the likelihood of the acoustic model used by the speech recognizer. Dereverberation is realized by filtering the wavelet packet coefficients with the Wiener gain to suppress the effects of the late reflection. Experimental evaluations with large vocabulary continuous speech recognition (LVCSR) in real reverberant conditions show that the proposed method outperforms conventional wavelet-based methods and other dereverberation techniques.
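The filtering step described in the abstract (scaling wavelet packet coefficients by a Wiener gain) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar filter pair, the per-band averaging of power, the gain floor, and the function names are all assumptions made for illustration. In particular, the late-reflection power per band is taken as a given input here, whereas the paper estimates it from wavelet bases selected according to acoustic-model likelihood, which this sketch does not reproduce.

```python
import numpy as np

def haar_wp_decompose(x, levels):
    """Full Haar wavelet packet tree; returns the 2**levels leaf bands.
    Assumes len(x) is divisible by 2**levels."""
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        nxt = []
        for b in bands:
            even, odd = b[0::2], b[1::2]
            nxt.append((even + odd) / np.sqrt(2.0))  # low-pass child
            nxt.append((even - odd) / np.sqrt(2.0))  # high-pass child
        bands = nxt
    return bands

def haar_wp_reconstruct(bands):
    """Inverse of haar_wp_decompose: merge sibling bands back up the tree."""
    bands = list(bands)
    while len(bands) > 1:
        nxt = []
        for lo, hi in zip(bands[0::2], bands[1::2]):
            out = np.empty(2 * len(lo))
            out[0::2] = (lo + hi) / np.sqrt(2.0)
            out[1::2] = (lo - hi) / np.sqrt(2.0)
            nxt.append(out)
        bands = nxt
    return bands[0]

def wiener_gain(observed_power, late_power, floor=1e-3):
    """Wiener gain S / (S + R), with speech power S approximated as
    observed power minus late-reflection power R (an assumption)."""
    speech_power = np.maximum(observed_power - late_power, 0.0)
    gain = speech_power / (speech_power + late_power + 1e-12)
    return np.maximum(gain, floor)  # floor avoids zeroing a band entirely

def dereverberate(x, late_power_per_band, levels=3):
    """Scale each leaf band by its Wiener gain, then reconstruct.
    late_power_per_band is a hypothetical input with 2**levels entries."""
    bands = haar_wp_decompose(x, levels)
    filtered = [wiener_gain(np.mean(b ** 2), lp) * b
                for b, lp in zip(bands, late_power_per_band)]
    return haar_wp_reconstruct(filtered)
```

With a late-reflection power of zero in every band the gain is close to one and the signal passes through almost unchanged; as the assumed late-reflection power grows relative to the observed band power, the band is attenuated toward the floor.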
Pages: 1242-1245
Page count: 4
Related papers
50 in total
  • [31] Study on the dereverberation of speech based on temporal envelope filtering
    Avendano, C
    Hermansky, H
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 889 - 892
  • [32] Robust features for speech recognition based on admissible wavelet packets
    Farooq, O
    Datta, S
    [J]. ELECTRONICS LETTERS, 2001, 37 (25) : 1554 - 1556
  • [33] Speech feature extraction based on wavelet modulation scale for robust speech recognition
    Ma, Xin
    Zhou, Weidong
    Ju, Fang
    Jiang, Qi
    [J]. NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233 : 499 - 505
  • [34] Robust Speech Dereverberation Based on WPE and Deep Learning
    Li, Hao
    Zhang, Xueliang
    Gao, Guanglai
    [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 52 - 56
  • [35] A Robust Speech Endpoint Detection Algorithm Based on Wavelet Packet and Energy Entropy
    Zhang, Ting
    Huang, Hua
    He, Ling
    Lech, Margaret
    [J]. 2013 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2013, : 1050 - 1054
  • [36] Guided spectrogram filtering for speech dereverberation
    Zheng, Chengshi
    Tan, Zheng-Hua
    Peng, Renhua
    Li, Xiaodong
    [J]. APPLIED ACOUSTICS, 2018, 134 : 154 - 159
  • [37] Speech Recognition Based on Wavelet Packet Transform and K-L Expansion
    Wang, Xu
    Han, Zhiyan
    Wang, Han
    Ma, Yujuan
    [J]. 2008 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-11, 2008, : 2490 - 2493
  • [38] Speech Emotion Recognition Based on LSTM and Mel Scale Wavelet Packet Decomposition
    Feng, Tian
    Yang, Shuying
    [J]. 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [39] MODEL-BASED DEREVERBERATION IN THE LOGMELSPEC DOMAIN FOR ROBUST DISTANT-TALKING SPEECH RECOGNITION
    Sehr, Armin
    Maas, Roland
    Kellermann, Walter
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4298 - 4301
  • [40] ROBUST EXCITATION-BASED FEATURES FOR AUTOMATIC SPEECH RECOGNITION
    Drugman, Thomas
    Stylianou, Yannis
    Chen, Langzhou
    Chen, Xie
    Gales, Mark J. F.
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4664 - 4668