Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition

Cited by: 0

Authors:

Naing, Hay Mar Soe [1]

Miyanaga, Yoshikazu [2]

Hidayat, Risanuri [1]

Winduratna, Bondhan [1]

Affiliations:

[1] Gadjah Mada Univ, Dept Elect Engn & Informat Technol, Yogyakarta 55281, Indonesia

[2] Hokkaido Univ, GS Informat Sci & Technol, GI CoRE GSB, Sapporo, Hokkaido 0600814, Japan

Source:

2019 INTERNATIONAL SYMPOSIUM ON MULTIMEDIA AND COMMUNICATION TECHNOLOGY (ISMAC) | 2019

Keywords:

children speech recognition; gammatone frequency integration; MFCC; speaker adaptative model;

DOI:

10.1109/ismac.2019.8836181

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper addresses robustness in children's speech recognition systems. It analyzes the shape of the filterbank as a means of suppressing additive background noise in the acoustic features of child speakers. In addition, Linear Discriminant Analysis (LDA), Maximum Likelihood Linear Transform (MLLT), and feature-space Maximum Likelihood Linear Regression (fMLLR) features are applied to build a speaker-adaptive acoustic model using the Kaldi speech recognition toolkit. Cepstral features based on the Gammatone filterbank and the Bark-scale filterbank were evaluated under noisy conditions using five types of noise at signal-to-noise ratios (SNRs) ranging from 10 dB to -10 dB. The detailed analysis shows that Gammatone frequency integration outperforms Mel Frequency Cepstral Coefficients (MFCC) across the different types of additive background noise and SNR conditions.


Pages: 6
