Morphological filtering of spectrograms for automatic speech recognition

Cited by: 0
Authors
Liu, WM [1 ]
Bastante, VJR [1 ]
Rodriguez, FR [1 ]
Evans, NWD [1 ]
Mason, JSD [1 ]
Affiliations
[1] Univ Coll Swansea, Sch Engn, Swansea, W Glam, Wales
Keywords
ASR (automatic speech recognition); segmentation; morphological filtering
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper examines the separation of speech signals from additive noise using a recently proposed signal-noise segmentation approach based on statistical properties of the spectrogram [1,2]. Competitive ASR results were reported in [3] despite using only crude spectrogram shape information, suggesting that the approach identifies regions of different signal dominance with high reliability and might be robust down to negative SNRs. This paper extends these early results in two directions. The first extension investigates the contribution of spectrogram shapes plus magnitudes versus shapes alone: the same ASR experiments as in [3] are repeated, but this time with magnitude information recovered in regions deemed to contain speech. Results show consistent improvement for all SNRs down to -5 dB. The second extension relates to computational efficiency: a modified one-pass version of the originally iterative process is proposed by empirically deducing an optimal final stopping condition for each SNR. This is found to reduce computation time significantly (by factors ranging from 7 to 18) whilst improving ASR accuracy.
Pages: 546 - 549
Number of pages: 4
Related papers
50 in total
  • [21] Emotion recognition from speech using deep learning on spectrograms
    Li, Xingguang
    Song, Wenjun
    Liang, Zonglin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (03) : 2791 - 2796
  • [22] Iterative Pyramidal Filtering Method for Improved Signal Recognition in Radio Spectrograms
    Lopez-Benitez, Miguel
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2022, 11 (06) : 1146 - 1150
  • [23] Employing Median Filtering to Enhance the Complex-valued Acoustic Spectrograms in Modulation Domain for Noise-robust Speech Recognition
    Hsieh, Hsin-Ju
    Chen, Berlin
    Hung, Jeih-weih
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [24] DATA-FILTERING METHODS FOR SELF-TRAINING OF AUTOMATIC SPEECH RECOGNITION SYSTEMS
    Georgescu, Alexandru-Lucian
    Manolache, Cristian
    Oneata, Dan
    Cucu, Horia
    Burileanu, Corneliu
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 141 - 147
  • [25] Automatic Speech Recognition System for Malay Speaking Children
    Rahman, Feisal Dani
    Mohamed, Noraini
    Mustafa, Mumtaz Begum
    Salim, Siti Salwah
    2014 THIRD ICT INTERNATIONAL STUDENT PROJECT CONFERENCE (ICT-ISPC), 2014, : 79 - 82
  • [26] Speech Emotion Recognition from Spectrograms with Deep Convolutional Neural Network
    Badshah, Abdul Malik
    Ahmad, Jamil
    Rahim, Nasir
    Baik, Sung Wook
    2017 INTERNATIONAL CONFERENCE ON PLATFORM TECHNOLOGY AND SERVICE (PLATCON), 2017, : 125 - 129
  • [27] Emotion Recognition from Speech using Spectrograms and Shallow Neural Networks
    Slimi, Anwer
    Hamroun, Mohamed
    Zrigui, Mounir
    Nicolas, Henri
    MOMM 2020: THE 18TH INTERNATIONAL CONFERENCE ON ADVANCES IN MOBILE COMPUTING & MULTIMEDIA, 2020, : 35 - 39
  • [28] Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms
    Satt, Aharon
    Rozenberg, Shai
    Hoory, Ron
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1089 - 1093
  • [29] Speech Emotion Recognition using Convolutional Recurrent Neural Networks and Spectrograms
    Qamhan, Mustafa A.
    Meftah, Ali H.
    Selouani, Sid-Ahmed
    Alotaibi, Yousef A.
    Zakariah, Mohammed
    Seddiq, Yasser Mohammad
    2020 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2020,
  • [30] AN APPROACH TO THE AUTOMATIC RECOGNITION OF SPEECH
    PAY, BE
    EVANS, CR
    INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1981, 14 (01) : 13 - 27