Robust speech recognition using compression of Mel sub-band energies and temporal filtering

Authors
Moradi, Naghmeh [1]
Nasersharif, Babak [1, 2]
Akbari, Ahmad [2 ]
Affiliations
[1] Faculty of Engineering, University of Guilan, Rasht, Iran
[2] Audio and Speech Processing Lab., Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran
Keywords
Signal processing - Speech recognition;
DOI
10.1109/ISTEL.2010.5734124
Abstract
Mel-frequency cepstral coefficients (MFCCs) are commonly used in speech recognition systems, but they are highly sensitive to external noise. In this paper, we propose a two-step method to compensate for the effects of noise on MFCCs. In the first step, we propose a sub-band SNR-dependent compression function for Mel sub-band energies that gives higher weights to sub-bands less contaminated by noise and lower weights to sub-bands more contaminated by noise. In the second step, we apply temporal filters to the weighted MFCCs to improve their temporal characteristics. Our results on the Aurora2 databases show that the proposed method outperforms both conventional temporal filtering methods and weighted MFCCs. © 2010 IEEE.
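The two-step pipeline described in the abstract can be illustrated with a rough sketch. The abstract does not give the actual compression function, the SNR estimator, or the temporal filter, so everything specific here is an assumption made for illustration: a sigmoid mapping from sub-band SNR to a weight (`snr_weight`), a spectral-subtraction style SNR estimate from a fixed per-band noise estimate, and a 3-tap smoothing kernel. All function names and parameter values are hypothetical and are not taken from the paper.

```python
import math


def snr_weight(snr_db, lo=0.1, hi=1.0, midpoint=5.0, slope=0.5):
    """Hypothetical sigmoid: maps a sub-band SNR (dB) to a weight in (lo, hi)."""
    return lo + (hi - lo) / (1.0 + math.exp(-slope * (snr_db - midpoint)))


def weighted_log_energies(energies, noise, floor=1e-10):
    """Step 1 (assumed form): scale each log Mel energy by an SNR-dependent weight.

    Scaling log(E) by w equals compressing E to E**w, so cleaner bands
    (high SNR, w near hi) keep more of their dynamic range than noisy ones.
    """
    out = []
    for e, n in zip(energies, noise):
        n = max(n, floor)
        clean = max(e - n, floor)  # crude clean-energy estimate (assumption)
        snr_db = 10.0 * math.log10(clean / n)
        out.append(snr_weight(snr_db) * math.log(max(e, floor)))
    return out


def dct2(values, num_coeffs):
    """Unnormalized DCT-II, turning weighted log energies into cepstra."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def temporal_filter(frames, kernel=(0.25, 0.5, 0.25)):
    """Step 2 (assumed form): FIR-filter each coefficient trajectory over time.

    Edges are handled by repeating the first and last frame.
    """
    half = len(kernel) // 2
    last = len(frames) - 1
    filtered = []
    for t in range(len(frames)):
        row = []
        for c in range(len(frames[0])):
            acc = 0.0
            for j, h in enumerate(kernel):
                idx = min(max(t + j - half, 0), last)
                acc += h * frames[idx][c]
            row.append(acc)
        filtered.append(row)
    return filtered


def robust_features(energy_frames, noise, num_coeffs=13):
    """Weighted cepstra per frame, followed by temporal filtering."""
    cepstra = [
        dct2(weighted_log_energies(frame, noise), num_coeffs)
        for frame in energy_frames
    ]
    return temporal_filter(cepstra)
```

Here `energy_frames` is a list of frames, each a list of Mel sub-band energies, and `noise` is a per-band noise energy estimate (for instance, averaged over leading non-speech frames). A different weighting function or a RASTA-style band-pass filter could be substituted without changing the overall structure.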
Pages: 760-763