Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition

Authors
Naing, Hay Mar Soe [1]
Miyanaga, Yoshikazu [2]
Hidayat, Risanuri [1]
Winduratna, Bondhan [1]
Affiliations
[1] Gadjah Mada Univ, Dept Elect Engn & Informat Technol, Yogyakarta 55281, Indonesia
[2] Hokkaido Univ, GS Informat Sci & Technol, GI CoRE GSB, Sapporo, Hokkaido 0600814, Japan
Keywords
children speech recognition; gammatone frequency integration; MFCC; speaker adaptive model
DOI
10.1109/ismac.2019.8836181
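Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809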
Abstract
This paper addresses robustness in children's speech recognition systems. It analyzes the shape of the filterbank as a means of suppressing additive background noise in the acoustic features of child speakers. In addition, Linear Discriminant Analysis (LDA), the Maximum Likelihood Linear Transform (MLLT), and feature-space Maximum Likelihood Linear Regression (fMLLR) features are applied to build a speaker-adaptive acoustic model using the Kaldi speech recognition toolkit. The performance of cepstral features based on the Gammatone filterbank and the Bark-scale filterbank was evaluated under noisy conditions using five different types of noise at signal-to-noise ratios (SNRs) ranging from 10 dB to -10 dB. The detailed analysis shows that Gammatone frequency integration outperforms Mel Frequency Cepstral Coefficients (MFCC) across the different types of additive background noise and SNR conditions.
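The noisy evaluation conditions mentioned in the abstract can be illustrated with a minimal sketch of mixing additive noise into a clean signal at a target SNR. This is not the authors' procedure: the helper names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the white Gaussian noise, and the 5 dB step between 10 dB and -10 dB are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole speech signal.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000  # assumed sampling rate in Hz
    t = np.arange(sample_rate) / sample_rate
    clean = np.sin(2.0 * np.pi * 440.0 * t)      # stand-in for a speech signal
    noise = rng.standard_normal(sample_rate // 2)  # stand-in for one noise type

    for snr_db in (10, 5, 0, -5, -10):  # assumed 5 dB step
        noisy = mix_at_snr(clean, noise, snr_db)
        print(snr_db, round(measured_snr_db(clean, noisy), 3))
```

Scaling the noise rather than the speech keeps the clean signal level fixed across conditions, which makes features extracted at different SNRs easier to compare.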
Pages: 6