Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition

被引:0
|
作者
Naing, Hay Mar Soe [1 ]
Miyanaga, Yoshikazu [2 ]
Hidayat, Risanuri [1 ]
Winduratna, Bondhan [1 ]
机构
[1] Gadjah Mada Univ, Dept Elect Engn & Informat Technol, Yogyakarta 55281, Indonesia
[2] Hokkaido Univ, GS Informat Sci & Techonol, GI CoRE GSB, Sapporo, Hokkaido 0600814, Japan
关键词
children speech recognition; gammatone frequency integration; MFCC; speaker adaptative model;
D O I
10.1109/ismac.2019.8836181
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
This paper focused on the issue of robustness in children speech recognition system. The shape of filterbank analysis is presented in this study to suppress the additive background noise in acoustic features of children speakers. In addition, the Linear Discriminant Analysis (LDA), the Maximum Likelihood Linear Transform (MLLT) and feature space Maximum Likelihood Linear Regression (fMLLR) features are applied to build the speaker adaptive acoustic model with the help of Kaldi speech recognition toolkit. The performance of Gammatone filterbank and Bark-scale filterbank based Cepstral features were evaluated under contaminated situations using five different types of noise at a range of signal to noise ratio (SNR) 10dB to -10dB. As the detailed analysis shown, the performance of Gammatone frequency integration is superior to Mel Frequency Cepstral Coefficient (MFCC) in different types of additive background noise and various SNR situations.
引用
收藏
页数:6
相关论文
共 50 条
  • [11] Non-Negative Subspace Projection During Conventional MFCC Feature Extraction for Noise Robust Speech Recognition
    Kumar, D. S. Pavan
    Bilgi, Raghavendra R.
    Umesh, S.
    2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013,
  • [12] FEATURE EXTRACTION WITH A MULTISCALE MODULATION ANALYSIS FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Mueller, Florian
    Mertins, Alfred
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7427 - 7431
  • [13] Gabor Filterbank Features for Robust Speech Recognition
    Missaoui, Ibrahim
    Lachiri, Zied
    IMAGE AND SIGNAL PROCESSING, ICISP 2014, 2014, 8509 : 665 - 671
  • [14] MVDR based feature extraction for robust speech recognition
    Dharanipragada, S
    Rao, BD
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 309 - 312
  • [15] Modified feature extraction methods in robust speech recognition
    Rajnoha, Josef
    Pollak, Petr
    2007 17TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA, VOLS 1 AND 2, 2007, : 337 - +
  • [16] Discriminative temporal feature extraction for robust speech recognition
    Shen, JL
    ELECTRONICS LETTERS, 1997, 33 (19) : 1598 - 1600
  • [17] Distinctive phonetic feature extraction for robust speech recognition
    Fukuda, T
    Yamamoto, W
    Nitta, T
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 25 - 28
  • [18] Chip design of MFCC extraction for speech recognition
    Wang, JC
    Wang, JF
    Weng, YS
    INTEGRATION-THE VLSI JOURNAL, 2002, 32 (1-2) : 111 - 131
  • [19] Combining speech enhancement and auditory feature extraction for robust speech recognition
    Kleinschmidt, M
    Tchorz, J
    Kollmeier, B
    SPEECH COMMUNICATION, 2001, 34 (1-2) : 75 - 91
  • [20] An efficient MFCC extraction method in speech recognition
    Han, Wei
    Chan, Cheong-Fat
    Choy, Chiu-Sing
    Pun, Kong-Pang
    2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS, 2006, : 145 - +