Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition

Authors
Naing, Hay Mar Soe [1 ]
Miyanaga, Yoshikazu [2 ]
Hidayat, Risanuri [1 ]
Winduratna, Bondhan [1 ]
Affiliations
[1] Gadjah Mada Univ, Dept Elect Engn & Informat Technol, Yogyakarta 55281, Indonesia
[2] Hokkaido Univ, GS Informat Sci & Technol, GI CoRE GSB, Sapporo, Hokkaido 0600814, Japan
Keywords
children speech recognition; gammatone frequency integration; MFCC; speaker adaptative model
DOI
10.1109/ismac.2019.8836181
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
This paper focuses on robustness in children's speech recognition systems. The shape of the analysis filterbank is studied as a means of suppressing additive background noise in the acoustic features of child speakers. In addition, Linear Discriminant Analysis (LDA), Maximum Likelihood Linear Transform (MLLT), and feature-space Maximum Likelihood Linear Regression (fMLLR) features are applied to build a speaker-adaptive acoustic model with the Kaldi speech recognition toolkit. Cepstral features based on Gammatone and Bark-scale filterbanks were evaluated under noisy conditions, using five different types of noise at signal-to-noise ratios (SNRs) ranging from 10 dB to -10 dB. The detailed analysis shows that Gammatone frequency integration outperforms Mel Frequency Cepstral Coefficients (MFCC) across the different types of additive background noise and SNR conditions.
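The noisy test conditions described in the abstract rely on mixing clean speech with additive noise at a chosen SNR. The following is a minimal sketch of that arithmetic, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the Gaussian noise, and the 5 dB step between 10 dB and -10 dB are all assumptions made for illustration. The noise is scaled so that the ratio of speech power to noise power equals 10^(SNR/10).

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Repeat the noise if it is shorter than the speech, then trim to length.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[: len(speech)]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Solve p_speech / (scale^2 * p_noise) = 10^(snr_db / 10) for scale.
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR of a mixture, computed from the clean signal and the residual."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-ins: a 220 Hz tone at 16 kHz and white Gaussian noise.
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    for snr_db in (10, 5, 0, -5, -10):  # step size is an assumption
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

In practice the speech and noise would be read from audio files and the power would often be measured over speech-active frames only; this sketch uses whole-signal power for simplicity.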
Pages: 6