Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition

Cited by: 0

Authors:

Naing, Hay Mar Soe [1]

Miyanaga, Yoshikazu [2]

Hidayat, Risanuri [1]

Winduratna, Bondhan [1]

Affiliations:

[1] Gadjah Mada Univ, Dept Elect Engn & Informat Technol, Yogyakarta 55281, Indonesia

[2] Hokkaido Univ, GS Informat Sci & Technol, GI CoRE GSB, Sapporo, Hokkaido 0600814, Japan

Source:

2019 INTERNATIONAL SYMPOSIUM ON MULTIMEDIA AND COMMUNICATION TECHNOLOGY (ISMAC) | 2019

Keywords:

children speech recognition; gammatone frequency integration; MFCC; speaker adaptative model;

DOI:

10.1109/ismac.2019.8836181

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Subject classification codes:

0808; 0809

Abstract:

This paper addresses robustness in children's speech recognition systems. It analyzes the shape of the filterbank as a way to suppress additive background noise in the acoustic features of child speakers. In addition, Linear Discriminant Analysis (LDA), Maximum Likelihood Linear Transform (MLLT), and feature-space Maximum Likelihood Linear Regression (fMLLR) features are applied to build a speaker-adaptive acoustic model using the Kaldi speech recognition toolkit. The performance of cepstral features based on the Gammatone filterbank and the Bark-scale filterbank was evaluated under noisy conditions, using five different types of noise at signal-to-noise ratios (SNR) ranging from 10 dB to -10 dB. The detailed analysis shows that Gammatone frequency integration outperforms Mel Frequency Cepstral Coefficients (MFCC) across the different types of additive background noise and the various SNR conditions.
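
The abstract describes contaminating speech with additive noise at SNRs from 10 dB down to -10 dB. The sketch below illustrates one common way to do that and is not taken from the paper: the function names `mix_at_snr` and `measured_snr_db`, the synthetic tone and Gaussian noise standing in for real recordings, and the 5 dB step between SNR levels are all assumptions made for illustration.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR of `noisy` given the clean `speech` it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    speech_power = sum(s * s for s in speech)
    noise_power = sum(r * r for r in residual)
    return 10 * math.log10(speech_power / noise_power)


rng = random.Random(0)
# Stand-in signals (assumptions): a 440 Hz tone sampled at 16 kHz and Gaussian noise.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# The 10 dB to -10 dB range comes from the abstract; the 5 dB step is assumed.
for snr_db in (10, 5, 0, -5, -10):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 6))
```

In an actual evaluation the noisy signals would then be passed to the feature extraction front end (MFCC or Gammatone-based) before decoding; that stage is outside the scope of this sketch.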


Pages: 6
