Morphological filtering of spectrograms for automatic speech recognition

Authors
Liu, WM [1]
Bastante, VJR [1]
Rodriguez, FR [1]
Evans, NWD [1]
Mason, JSD [1]
Affiliations
[1] Univ Coll Swansea, Sch Engn, Swansea, W Glam, Wales
Keywords
ASR (automatic speech recognition); segmentation; morphological filtering
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper examines the separation of speech signals from additive noise using a recently proposed signal-noise segmentation approach based on statistical properties of the spectrogram [1,2]. Competitive ASR results were reported in [3] despite using only crude spectrogram shape information, suggesting that the approach identifies regions of different signal dominance with high reliability and might be robust down to negative SNRs. This paper extends these early results in two directions. The first extension investigates the contribution of spectrogram shapes plus magnitudes versus shapes alone: the same ASR experiments as in [3] are repeated, but this time with magnitude information recovered in regions deemed to contain speech. Results show consistent improvement for all SNRs down to -5 dB. The second extension relates to computational efficiency: a modified one-pass version of the originally iterative process is proposed, with an optimal final stopping condition deduced empirically for each SNR. This is found to reduce computational time significantly (by factors ranging from 7 to 18) whilst improving ASR accuracy.
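The abstract does not give the details of the segmentation, so the following is only an illustrative sketch of the general idea: clean a thresholded spectrogram mask with a morphological opening, then recover magnitudes in the regions deemed to contain speech. It is not the authors' statistical segmentation. The function names, the fixed threshold, the square structuring element and the toy spectrogram are all assumptions made for illustration.

```python
# Illustrative sketch only: generic binary morphological opening on a
# thresholded spectrogram, NOT the segmentation method of the paper.
# All names, the threshold and the toy data are hypothetical.


def erode(mask, k=1):
    """Binary erosion with a (2k+1)x(2k+1) square; cells outside the grid count as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        [
            int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr in range(-k, k + 1)
                for dc in range(-k, k + 1)
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def dilate(mask, k=1):
    """Binary dilation with a (2k+1)x(2k+1) square."""
    rows, cols = len(mask), len(mask[0])
    return [
        [
            int(any(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr in range(-k, k + 1)
                for dc in range(-k, k + 1)
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def segment(spec, threshold, k=1):
    """Threshold the spectrogram, then apply an opening (erosion followed by dilation)."""
    raw = [[int(v > threshold) for v in row] for row in spec]
    return dilate(erode(raw, k), k)


def recover_magnitude(spec, mask, floor=0.0):
    """Keep the original magnitude where the mask marks speech; use a floor elsewhere."""
    return [
        [v if m else floor for v, m in zip(spec_row, mask_row)]
        for spec_row, mask_row in zip(spec, mask)
    ]


# Toy spectrogram (frequency bins x frames): low background energy,
# one 3x3 "speech" region and one isolated high-energy noise cell.
spec = [[0.1] * 7 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        spec[r][c] = 5.0
spec[0][5] = 4.0  # isolated noise spike

mask = segment(spec, threshold=1.0)          # shape information only
recovered = recover_magnitude(spec, mask)    # shape plus magnitude
```

In this toy case the opening discards the isolated spike while keeping the 3x3 region, so `mask` carries shape information alone and `recovered` adds the magnitudes back inside that region, mirroring the "shapes alone" versus "shapes plus magnitudes" comparison described in the abstract.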
Pages: 546-549
Page count: 4